Summary


INTRODUCTION. 


No attempts have been made, as far as I know, to calculate special formulae for the standard deviations of fraternal and parental correlation coefficients. The usual formula for the standard deviation of a correlation coefficient*, which is deduced on the supposition that the values of the same variable are mutually uncorrelated, is generally used also for this case, although it is only correct for a fraternal correlation coefficient calculated from only two siblings of each family and for a parental correlation coefficient when only one offspring value from each family enters into the calculation. When the material of observation, as is usually the case in investigations of inheritance in higher mammals, consists of families of varying size, and correlation tables are used in which the same weight is given to each observed pair of siblings or pair of parent and offspring, without regard to the size of the family, a rational treatment of the probable error is excluded at the outset. With material in hand which makes it possible to examine numerous siblings, it is most reasonable to confine the investigation to a constant number of offspring from each family. In this case the deduction of formulae for the standard deviations of the two correlation coefficients does not present special difficulties, and this problem will be solved here.

* Vide Pearson and Filon: Phil. Trans. Vol. 191 A, p. 229, 1898.


We shall suppose that each group of $q$ siblings belongs to the same litter or that for other reasons their order of birth is indifferent. Then each pair of siblings or each pair of parent and offspring ought to take a like part in the calculation, and $q$ siblings give rise to $\tfrac12 q(q-1)$ pairs of brothers and $q$ pairs of parent and offspring, all of which are entered in the calculation.


The fraternal correlation can thus be calculated either from a correlation table which is made symmetrical, so that it contains $q(q-1)$ entries from each fraternity, or by the formula quoted p. 10, which gives an identical result.


I. FRATERNAL CORRELATION.


Although this investigation aims especially at fraternal correlation it concerns 
of course other calculations of correlation in which the material consists of classes 
of equal size inside which the individuals are mutually correlated, all of them 
forming like parts. In the following we shall therefore name a group of siblings 
a class. 


Suppose we have a material consisting of q individuals from each of n classes 
inside which the individuals are correlated while individuals from different classes 
are uncorrelated. We can then consider such a material as one of many possible 
samples of the same nature and size drawn from a population consisting of classes 
of individuals correlated as mentioned. It is therefore possible to face the problem 
of finding the law of errors for the mean value, the standard deviation of the 
character concerned and further for the correlation coefficient inside a class, supposing 
that these are all calculated from a sample like the one now considered. 


Let the sample be $y_1, y_2, y_3, \ldots, y_{nq}$, with mean value $\bar y$ and standard deviation $\sigma$. No special notation will be introduced for individuals of the same class, but summation of products is indicated by $\Sigma$ when all factors of the product belong to the same class, and by $S$ when factors of the same product belong to two or more classes. The summations always extend to all $n$ classes.

(a) The Mean Value.


For the sample in hand we have

$$\bar y = \frac{1}{nq}\,\Sigma(y_1).$$

The mean value of $\bar y$ for a great number of samples coincides, according to the suppositions, with the mean of the population, and this we choose for the zero point of $y$. The squared standard deviation of $\bar y$ is therefore found simply by squaring the expression above, summing for all the samples imagined and taking the mean value of the result. We thus find

$$\sigma_{\bar y}^2 = \frac{1}{n^2q^2}\left\{\overline{\Sigma(y_1^2)} + 2\,\overline{\Sigma(y_1y_2)} + 2\,\overline{S(y_1y_2)}\right\},$$

where a bar above a summation indicates that the mean value has to be taken of the sums for all samples, i.e. for the population. Let the standard deviation of the population be $s$ and the correlation coefficient for individuals of the same class $r$; we then have

$$\overline{\Sigma(y_1^2)} = nqs^2 \quad\text{and}\quad \overline{\Sigma(y_1y_2)} = \tfrac12\,nq(q-1)\,rs^2.$$

As individuals of different classes are uncorrelated, $\overline{S(y_1y_2)}$ is equal to 0, and accordingly we find*

$$\sigma_{\bar y}^2 = \frac{s^2}{nq}\,\{1+(q-1)r\}. \tag{1}$$

This contains $s$ and $r$ for the population, which are, as a rule, only known from the sample in hand. It will be seen in the following what approximation is obtained by putting $s$ and $r$ equal to the values found from the sample.
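Formula (1) can be checked by simulation. The sketch below is only an illustration and not part of the original derivation: it assumes normal variation with $r \ge 0$ and builds each individual as a part shared by the whole class plus a part of its own, which gives the required correlation $r$ inside a class. The function names, the number of trials and the seed are arbitrary choices.

```python
import random
import statistics


def simulate_mean_variance(n, q, r, s=1.0, trials=20000, seed=1):
    """Empirical variance of the grand mean over many simulated samples.

    Assumption: y = sqrt(r)*a + sqrt(1-r)*e, with a shared by the class,
    which gives correlation r inside a class and 0 between classes.
    """
    rng = random.Random(seed)
    shared = (r ** 0.5) * s         # part common to the whole class
    own = ((1.0 - r) ** 0.5) * s    # part peculiar to one individual
    means = []
    for _ in range(trials):
        total = 0.0
        for _ in range(n):
            a = rng.gauss(0.0, 1.0)
            for _ in range(q):
                total += shared * a + own * rng.gauss(0.0, 1.0)
        means.append(total / (n * q))
    # The population mean is the zero point, so deviations are taken from 0.
    return statistics.pvariance(means, mu=0.0)


def theoretical_mean_variance(n, q, r, s=1.0):
    """Formula (1): s^2 {1 + (q-1) r} / (nq)."""
    return s * s * (1.0 + (q - 1) * r) / (n * q)


if __name__ == "__main__":
    print(simulate_mean_variance(5, 4, 0.5), theoretical_mean_variance(5, 4, 0.5))
```

With $n=5$, $q=4$, $r=0.5$ the theoretical value is $0.125$, and the simulated value should agree with it to within the sampling error of the simulation.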


(b) The Mean Value of $\sigma^2$: the Presumptive Standard Deviation.


For our sample we find

$$\sigma^2 = \frac{1}{nq}\,\Sigma(y_1^2) - \bar y^2. \tag{2}$$

By taking the mean of $\sigma^2$ for a great number of samples we find from this, remembering that the mean of $\bar y^2$ equals $\sigma_{\bar y}^2$,

$$\overline{\sigma^2} = s^2\left\{1 - \frac{1+(q-1)r}{nq}\right\}. \tag{3}$$

When we take the value found for $\sigma^2$ as an approximation to $\overline{\sigma^2}$, we find accordingly the presumptive value of the standard deviation of the population by the formula

$$s^2 = \sigma^2\,\frac{nq}{nq-\{1+(q-1)r\}},$$

which for $r=0$ or $q=1$ takes the form known for uncorrelated observations.

For the S.D. of $\bar y$ we find by introducing $s$ in (1)

$$\sigma_{\bar y}^2 = \sigma^2\,\frac{1+(q-1)r}{nq-\{1+(q-1)r\}}.$$


* Vide Comptes-Rendus des Trav. du Lab. Carlsberg, Vol. XIV. No. 11, 1921, Copenhagen, p. 32.
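The correction from the sample value $\sigma^2$ to the presumptive value $s^2$ can be written as a small helper. This is a minimal sketch under the formula of section (b); the function name is hypothetical, and $r$ must in practice be replaced by the value found from the sample.

```python
def presumptive_variance(sigma2, n, q, r):
    """Presumptive s^2 from the sample sigma^2, following section (b).

    sigma2 : squared S.D. computed from the nq sample values
    n, q   : number of classes and number of individuals per class
    r      : correlation inside a class (in practice the sample value)
    """
    c = 1.0 + (q - 1) * r
    return sigma2 * n * q / (n * q - c)


if __name__ == "__main__":
    # For r = 0 the factor reduces to N / (N - 1) with N = nq.
    print(presumptive_variance(1.0, 10, 5, 0.0))
    print(presumptive_variance(1.0, 10, 5, 0.5))
```

For $r=0$ or $q=1$ the factor is the familiar $N/(N-1)$ with $N = nq$; for positive $r$ the correction is larger, because correlated individuals carry less independent information.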


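(c) The Standard Deviation of $\sigma^2$.


The S.D. of the $\sigma^2$ of our sample is found from $\sigma_{\sigma^2}^2 = \overline{\sigma^4} - (\overline{\sigma^2})^2$, where the latter term is already known. From (2) we find for the calculation of $\overline{\sigma^4}$

$$n^2q^2\sigma^2 = (nq-1)\,\Sigma(y_1^2) - 2\,\Sigma(y_1y_2) - 2\,S(y_1y_2), \tag{4}$$

and from this

$$\begin{aligned}
n^4q^4\sigma^4 ={}& (nq-1)^2\,(\Sigma(y_1^2))^2 + 4\,(\Sigma(y_1y_2))^2 + 4\,(S(y_1y_2))^2 \\
&- 4(nq-1)\,\Sigma(y_1^2)\,\Sigma(y_1y_2) - 4(nq-1)\,\Sigma(y_1^2)\,S(y_1y_2) + 8\,\Sigma(y_1y_2)\,S(y_1y_2).
\end{aligned} \tag{5}$$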


For the calculation of the mean values contained in this equation, the six products of product sums must be examined. We find

$$\begin{aligned}
(\Sigma(y_1^2))^2 &= \Sigma(y_1^4) + 2\,\Sigma(y_1^2y_2^2) + 2\,S(y_1^2y_2^2),\\
(\Sigma(y_1y_2))^2 &= \Sigma(y_1^2y_2^2) + 2\,\Sigma(y_1^2y_2y_3) + 6\,\Sigma(y_1y_2y_3y_4) + 2\,S(\underset{s}{y_1y_2}\,\underset{s}{y_3y_4})^{*},\\
\Sigma(y_1^2)\,\Sigma(y_1y_2) &= \Sigma(y_1^3y_2) + \Sigma(y_1^2y_2y_3) + S(y_1^2\,\underset{s}{y_2y_3}).
\end{aligned} \tag{6}$$

When the multiplication of products containing the factor $S(y_1y_2)$ is carried out, it is clear that we need not consider such sums of products where the product contains a factor which is uncorrelated with all the other factors of the product, because the mean values of such product sums are 0. In the products $\Sigma(y_1^2)\,S(y_1y_2)$ and $\Sigma(y_1y_2)\,S(y_1y_2)$ all the sums of products are of this kind, the factors being distributed either in two classes of which one contains 3 and the other 1 factor or in three classes with respectively 2, 1 and 1 in each.


We therefore find

$$\begin{aligned}
(S(y_1y_2))^2 &= S(y_1^2y_2^2) + 2\,S(y_1^2\,\underset{s}{y_2y_3}) + 4\,S(\underset{s}{y_1y_2}\,\underset{s}{y_3y_4}) + \epsilon_1,\\
\Sigma(y_1^2)\,S(y_1y_2) &= \epsilon_2,\\
\Sigma(y_1y_2)\,S(y_1y_2) &= \epsilon_3,
\end{aligned} \tag{7}$$

where the mean values of the $\epsilon$'s for the population are 0.


Let us denote the product moment corresponding to $y_1^m y_2^n y_3^p y_4^q$ by $\beta_{mnpq}$ if all factors belong to the same class, and in the opposite case let us insert 'd' or 's' as denoting different or same class.


* In the sums $S$ all factors of a product are supposed to belong to different classes except those which are denoted by an 's' inserted between them, as belonging to the same class.
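We find then

$$\begin{aligned}
\overline{\Sigma(y_1^4)} &= nq\,\beta_4,\\
\overline{\Sigma(y_1^3y_2)} &= nq(q-1)\,\beta_{31},\\
\overline{\Sigma(y_1^2y_2^2)} &= \tfrac12\,nq(q-1)\,\beta_{22},\\
\overline{\Sigma(y_1^2y_2y_3)} &= \tfrac12\,nq(q-1)(q-2)\,\beta_{211},\\
\overline{\Sigma(y_1y_2y_3y_4)} &= \tfrac1{24}\,nq(q-1)(q-2)(q-3)\,\beta_{1111},\\
\overline{S(y_1^2y_2^2)} &= \tfrac12\,n(n-1)q^2\,\underset{d}{\beta_{22}},\\
\overline{S(y_1^2\,\underset{s}{y_2y_3})} &= \tfrac12\,n(n-1)q^2(q-1)\,\underset{d\,s}{\beta_{211}},\\
\overline{S(\underset{s}{y_1y_2}\,\underset{s}{y_3y_4})} &= \tfrac18\,n(n-1)q^2(q-1)^2\,\underset{s\,d\,s}{\beta_{1111}}.
\end{aligned} \tag{8}$$


Till now no suppositions have been made as to the law of distribution of the $y$'s, but in the following calculation we shall suppose that the distribution is normal and the correlation between individuals of the same class normal.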


For the general case of normal correlation between $n$ variables the product moments have been determined by Sverker Bergström*. Taking the standard deviations as units of the variable and denoting the correlation coefficients by $r_{12}, r_{13}, \ldots$, where for instance $r_{23}$ means the correlation coefficient between the 2nd and 3rd variable of a product moment $\beta'_{mnpq}$, he finds the following formulae for the product moments of the 4th order:

$$\begin{aligned}
\beta'_4 &= 3,\\
\beta'_{31} &= 3r_{12},\\
\beta'_{22} &= 2r_{12}^2 + 1,\\
\beta'_{211} &= r_{23} + 2r_{12}r_{13},\\
\beta'_{1111} &= r_{12}r_{34} + r_{13}r_{24} + r_{14}r_{23}.
\end{aligned} \tag{9}$$

Substituting our special values for the correlation coefficient we find

$$\begin{aligned}
\beta_4 &= 3s^4,\\
\beta_{31} &= 3rs^4,\\
\beta_{22} &= (2r^2+1)s^4,\\
\beta_{211} &= r(1+2r)s^4,\\
\beta_{1111} &= 3r^2s^4,
\end{aligned}$$

and further

$$\begin{aligned}
\underset{d}{\beta_{22}} &= s^4,\\
\underset{d\,s}{\beta_{211}} &= rs^4,\\
\underset{s\,d\,s}{\beta_{1111}} &= r^2s^4.
\end{aligned} \tag{10}$$

We are now by means of (8) and (10) in a position to evaluate the mean values of the products put down under (6) and (7).


* Vide S. Bergström: Biometrika, Vol. XII. 1918, p. 177.
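We find

$$\begin{aligned}
\overline{(\Sigma(y_1^2))^2} &= nq\,\{nq+2+2(q-1)r^2\}\,s^4,\\
\overline{(\Sigma(y_1y_2))^2} &= \tfrac12\,nq(q-1)\,\{1+2(q-2)r+[\tfrac12 nq(q-1)+q^2-3q+3]\,r^2\}\,s^4,\\
\overline{\Sigma(y_1^2)\,\Sigma(y_1y_2)} &= \tfrac12\,nq(q-1)\,\{nq+4+2(q-2)r\}\,rs^4,\\
\overline{(S(y_1y_2))^2} &= \tfrac12\,n(n-1)q^2\,\{1+(q-1)r\}^2\,s^4,\\
\overline{\Sigma(y_1^2)\,S(y_1y_2)} &= 0 \quad\text{and}\quad \overline{\Sigma(y_1y_2)\,S(y_1y_2)} = 0.
\end{aligned} \tag{11}$$

The calculation of $\overline{\sigma^4}$ may now be continued. We find, by substituting the above mean values in (5),

$$n^2q^2\,\overline{\sigma^4} = s^4\,\{n^2q^2 - 1 - 2(nq+1)(q-1)r + (q-1)[2nq-(q-1)]\,r^2\}.$$

From (3) is found

$$n^2q^2\,(\overline{\sigma^2})^2 = s^4\,\{n^2q^2 - 2nq + 1 - 2(nq-1)(q-1)r + (q-1)^2r^2\},$$

and accordingly

$$\sigma_{\sigma^2}^2 = \overline{\sigma^4} - (\overline{\sigma^2})^2 = \frac{2s^4}{n^2q^2}\,\{nq - 1 - 2(q-1)r + (q-1)(nq-q+1)\,r^2\},$$

or arranged according to powers of $nq$

$$\sigma_{\sigma^2}^2 = \frac{2s^4}{nq}\left\{1+(q-1)r^2 - \frac{1}{nq}\,[1+(q-1)r]^2\right\}. \tag{12}$$

This formula for the S.D. of the squared standard deviations is thus exact, supposing that the correlation be normal.


For great values of $n$, or rather of $\dfrac{nq}{1+(q-1)r}$, we may consider the S.D. of $\sigma^2$ a differential, so that

$$(\sigma+\delta\sigma)^2 = \sigma^2 + \delta\sigma^2 = \sigma^2 + 2\sigma\,\delta\sigma.$$

From $\delta\sigma^2 = 2\sigma\,\delta\sigma$ we find by squaring and taking mean value for a great number of samples,

$$\sigma_{\sigma^2}^2 = 4\sigma^2\,\sigma_\sigma^2,$$

and by substituting the value of $\sigma_{\sigma^2}^2$, omitting the last term,

$$\sigma_\sigma^2 = \frac{s^4}{2\sigma^2\,nq}\,\{1+(q-1)r^2\},$$

or, as with the accuracy obtainable we have $\sigma = s$, it follows that

$$\sigma_\sigma^2 = \frac{s^2}{2nq}\,\{1+(q-1)r^2\}.$$

We notice when comparing this formula with (1) that only for $r=1$ and $r=0$ does the rule

$$\sigma_\sigma^2 = \tfrac12\,\sigma_{\bar y}^2$$

hold good.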


The fraternal correlation coefficient $\rho$ for the present sample is, when all the $\tfrac12 nq(q-1)$ pairs of siblings are used for the calculation, defined by

$$\rho = \frac{\Pi}{\sigma^2},$$

where

$$\Pi = \frac{2}{nq(q-1)}\,\Sigma(y_1y_2) - \bar y^2. \tag{13}$$

To determine the S.D. of $\rho$ one requires, in addition to $\sigma_{\sigma^2}$, the S.D. of $\Pi$ and the product moment for $\Pi$ and $\sigma^2$.


(d) Mean Value and Standard Deviation of the Product Moment $\Pi$.

Taking mean value of (13) for a great number of samples we find, as $\overline{\Sigma(y_1y_2)} = \tfrac12 nq(q-1)rs^2$ and $\overline{\bar y^2} = \sigma_{\bar y}^2$,

$$\overline{\Pi} = s^2\left\{r - \frac{1}{nq}\,[1+(q-1)r]\right\}. \tag{14}$$

For calculating the mean value of $\Pi^2$, (13) may be written

$$n^2q^2(q-1)\,\Pi = -(q-1)\,\Sigma(y_1^2) + 2(nq-q+1)\,\Sigma(y_1y_2) - 2(q-1)\,S(y_1y_2), \tag{15}$$

from which follows

$$\begin{aligned}
n^4q^4(q-1)^2\,\overline{\Pi^2} ={}& (q-1)^2\,\overline{(\Sigma(y_1^2))^2} + 4(nq-q+1)^2\,\overline{(\Sigma(y_1y_2))^2} \\
&- 4(q-1)(nq-q+1)\,\overline{\Sigma(y_1^2)\,\Sigma(y_1y_2)} + 4(q-1)^2\,\overline{(S(y_1y_2))^2},
\end{aligned}$$

the mean values of the two products containing $\Sigma \times S$ being 0 according to (11). Substituting the rest of the values from (11) we find

$$(q-1)\,n^2q^2\,\overline{\Pi^2} = s^4\,\{2nq-(q-1) + 2r\,[nq(q-3)-(q-1)^2] + r^2\,[n^2q^2(q-1) - 2nq(q-2) - (q-1)^3]\}, \tag{16}$$

and by squaring (14) is found

$$(q-1)\,n^2q^2\,(\overline{\Pi})^2 = s^4\,\{q-1 - 2r\,[nq(q-1)-(q-1)^2] + r^2\,[n^2q^2(q-1) - 2nq(q-1)^2 + (q-1)^3]\}.$$

By subtraction of this equation from (16) we arrive at

$$\sigma_\Pi^2 = \frac{2s^4}{n^2q^2(q-1)}\,\{nq-(q-1) + 2r\,[nq(q-2)-(q-1)^2] + r^2\,[nq(q^2-3q+3)-(q-1)^3]\},$$

or arranged according to $nq$

$$\sigma_\Pi^2 = \frac{2s^4}{nq(q-1)}\left\{1+2(q-2)r+(q^2-3q+3)\,r^2 - \frac{q-1}{nq}\,[1+2(q-1)r+(q-1)^2r^2]\right\},$$

which may also be written

$$\sigma_\Pi^2 = \frac{2s^4}{nq(q-1)}\left\{[1+r(q-2)]^2 + r^2(q-1) - \frac{q-1}{nq}\,[1+r(q-1)]^2\right\}. \tag{17}$$


8 Fraternal and Parental Correlation Coefficients 


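(e) The Product Moment, $\Pi_{\Pi\sigma^2}$, of $\Pi$ and $\sigma^2$.


By multiplication of (4) and (15) and taking mean value for a great number of samples we find for the mean value of the product $\Pi\sigma^2$

$$\begin{aligned}
n^4q^4(q-1)\,\overline{\Pi\sigma^2} ={}& -(nq-1)(q-1)\,\overline{(\Sigma(y_1^2))^2} - 4(nq-q+1)\,\overline{(\Sigma(y_1y_2))^2} \\
&+ 2\,\{n^2q^2 - nq^2 + 2(q-1)\}\,\overline{\Sigma(y_1^2)\,\Sigma(y_1y_2)} + 4(q-1)\,\overline{(S(y_1y_2))^2},
\end{aligned}$$

the mean values of the two products containing $\Sigma \times S$ being zero according to (11). Introducing the rest of the mean values from (11), we have

$$n^2q^2\,\overline{\Pi\sigma^2} = s^4\,\{-nq-1 + r\,[n^2q^2 - nq(q-4) - 2(q-1)] + r^2\,[nq(q-3)-(q-1)^2]\}.$$

From (3) and (14) is found

$$n^2q^2\,\overline{\Pi}\cdot\overline{\sigma^2} = s^4\,\{-nq+1 + r\,[n^2q^2 - nq^2 + 2(q-1)] + r^2\,[-nq(q-1)+(q-1)^2]\}.$$

As $\Pi_{\Pi\sigma^2} = \overline{\Pi\sigma^2} - \overline{\Pi}\cdot\overline{\sigma^2}$, it follows from the two foregoing equations that

$$\begin{aligned}
\Pi_{\Pi\sigma^2} &= \frac{2s^4}{n^2q^2}\,\{-1 + 2r\,[nq-(q-1)] + r^2\,[nq(q-2)-(q-1)^2]\} \\
&= \frac{2s^4}{nq}\left\{r\,[2+(q-2)r] - \frac{1}{nq}\,[1+(q-1)r]^2\right\}.
\end{aligned} \tag{18}$$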


(f) The Standard Deviation of the Fraternal Correlation Coefficient.

If the sample is great in proportion to $(q-1)r$ the errors of $\Pi$ and $\sigma^2$ can be treated as differentials, and we have for the correlation coefficient calculated from a sample

$$\rho = \frac{\overline{\Pi}+\delta\Pi}{\overline{\sigma^2}+\delta\sigma^2} = \frac{\overline{\Pi}}{\overline{\sigma^2}}\left(1 + \frac{\delta\Pi}{\overline{\Pi}} - \frac{\delta\sigma^2}{\overline{\sigma^2}}\right)$$

and

$$\bar\rho = \frac{\overline{\Pi}}{\overline{\sigma^2}},$$

and therefore, neglecting the term containing $\dfrac{1}{nq}$, which according to these suppositions cannot be evaluated,

$$\bar\rho = r.$$

From $\delta\rho = \dfrac{1}{\overline{\sigma^2}}\left\{\delta\Pi - \dfrac{\overline{\Pi}}{\overline{\sigma^2}}\,\delta\sigma^2\right\}$ we find by squaring and forming mean value

$$\sigma_\rho^2 = \frac{1}{(\overline{\sigma^2})^2}\left\{\sigma_\Pi^2 + \left(\frac{\overline{\Pi}}{\overline{\sigma^2}}\right)^2\sigma_{\sigma^2}^2 - 2\,\frac{\overline{\Pi}}{\overline{\sigma^2}}\,\Pi_{\Pi\sigma^2}\right\}.$$

When the values from (3), (12), (14), (17) and (18) are introduced in this formula and the terms containing the higher power of $\dfrac{1}{nq}$ are neglected, we get

$$\sigma_\rho^2 = \frac{2}{nq(q-1)}\,\{[1+r(q-2)]^2 + r^2(q-1)\} + \frac{2r^2}{nq}\,\{1+(q-1)r^2\} - \frac{4r^2}{nq}\,\{2+(q-2)r\},$$

from which is found

$$\sigma_\rho^2 = \frac{2}{nq(q-1)}\,\{1+r(q-2)-r^2(q-1)\}^2$$

and

$$\sigma_\rho = \sqrt{\frac{2}{nq(q-1)}}\;(1-r)\,\{1+(q-1)r\}. \tag{19}$$

For $q=2$ this formula coincides with the usual formula for the standard deviation of a correlation coefficient calculated from two series of values of two variables corresponding in pairs, the values of each series being mutually uncorrelated.
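The reduction for $q=2$ is easily checked numerically. The following is a minimal sketch of formula (19); the function name is hypothetical. For $q=2$ there are $N=n$ pairs, and the expression should equal the usual $(1-r^2)/\sqrt{N}$.

```python
import math


def sd_fraternal(n, q, r):
    """Formula (19): S.D. of the fraternal correlation coefficient
    for n classes of q individuals with correlation r inside a class."""
    return (1.0 - r) * (1.0 + (q - 1) * r) * math.sqrt(2.0 / (n * q * (q - 1)))


if __name__ == "__main__":
    n, r = 200, 0.4
    # For q = 2 the number of pairs is N = n.
    print(sd_fraternal(n, 2, r), (1.0 - r * r) / math.sqrt(n))
```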


(g) Numerical Evaluation of the Formula for the S.D. of a Fraternal Correlation Coefficient.


The number, $N$, of observed pairs of observations being equal to $\tfrac12 nq(q-1)$, the formula (19) may also be written

$$\sigma_\rho = \frac{1}{\sqrt N}\,(1-r)\,\{1+(q-1)r\}.$$

Comparing materials of observations with different numbers of siblings $q$, we see that for the calculation of fraternal correlation, information of each available pair of siblings has a value inversely proportional to $\{1+(q-1)r\}^2$. The ratio

$$v_q = \left(\frac{1+r}{1+(q-1)r}\right)^2$$

serves as a measure for the value which must be attributed to information of an observed pair among $q$ siblings, supposing that all of the $\tfrac12 nq(q-1)$ pairs of siblings are used for the calculation, and supposing that the value of information of a pair of siblings for $q=2$ is put equal to 1. On the other hand $\dfrac{1}{v_q}$ indicates the ratio between the numbers of pairs of siblings which are required for obtaining the same accuracy in the correlation coefficient in the case of $q$ and in the case of two siblings from each family. Table I gives the numerical values of $v_q$ for different values of $r$ and $q$.
TABLE I.

$v_q = \left(\dfrac{1+r}{1+(q-1)r}\right)^2$.

 q   r=0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9

 2   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
 3    .840   .735   .660   .605   .563   .529   .502   .479   .460
 4    .716   .563   .468   .406   .360   .327   .301   .280   .264
 5    .617   .444   .349   .290   .250   .221   .200   .184   .171
 6    .538   .360   .270   .218   .184   .160   .143   .130   .119
 7    .473   .298   .216   .170   .141   .121   .107   .096   .088
 8    .419   .250   .176   .136   .111   .095   .083   .074   .068
 9    .373   .213   .146   .111   .090   .076   .066   .059   .054
10    .335   .184   .123   .093   .074   .063   .054   .048   .044
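The entries of Table I follow directly from the ratio $v_q$ defined above. The sketch below recomputes them; the function name is hypothetical, and the printed layout is only a convenience.

```python
def information_ratio(q, r):
    """v_q = ((1 + r) / (1 + (q - 1) r))^2: value of one observed pair among
    q siblings relative to a pair from a family of two siblings."""
    return ((1.0 + r) / (1.0 + (q - 1) * r)) ** 2


if __name__ == "__main__":
    r_values = [k / 10 for k in range(1, 10)]
    for q in range(2, 11):
        row = " ".join(f"{information_ratio(q, r):.3f}" for r in r_values)
        print(f"{q:2d} {row}")
```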
We notice that for values of $r$ somewhat greater than 0.5, such as are usually found for mammals, $v_3$ has already decreased to about $\tfrac12$ and $v_5$ to about $\tfrac14$. By giving the same weight to each pair of siblings when forming fraternal correlation tables from a material consisting of fraternities of different size, we therefore fail very largely to pay due regard to the observations. With material under consideration, as for example anthropometric data, which according to its nature consists of small groups of siblings of varying number, and which is not so numerous that we can afford to omit observations from the calculation to make $q$ constant for all fraternities, the rational proceeding must be to sort the material according to the number of siblings and calculate the correlation coefficient of each group separately.

It is then possible to effect considerable saving of time and labour in the investigation of correlation by avoiding the forming of fraternal correlation tables and using instead the formula*

$$\rho = \frac{1}{q-1}\left(q\,\frac{\sigma_q^2}{\sigma^2} - 1\right),$$

where $\sigma_q$ is the directly calculated S.D. for mean values of fraternities. The results found by the formula are identical with those of the defining formula, so that the only objection to this method of calculation is the lack of opportunity to examine the shape of the regression curve.
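That the two ways of calculation give identical results can be illustrated on a small made-up example. The sketch below assumes the shortcut in the form $\rho = \{q\sigma_q^2/\sigma^2 - 1\}/(q-1)$, with $\sigma_q^2$ taken about the general mean; the data and function names are invented for the illustration.

```python
from itertools import combinations


def fraternal_by_pairs(classes):
    """Defining formula: product moment over all sibling pairs, divided by sigma^2."""
    n, q = len(classes), len(classes[0])
    values = [y for c in classes for y in c]
    mean = sum(values) / (n * q)
    var = sum(y * y for y in values) / (n * q) - mean * mean
    pair_sum = sum(a * b for c in classes for a, b in combinations(c, 2))
    product_moment = 2.0 * pair_sum / (n * q * (q - 1)) - mean * mean
    return product_moment / var


def fraternal_by_means(classes):
    """Shortcut: uses only the S.D. of the fraternity means (assumed form)."""
    n, q = len(classes), len(classes[0])
    values = [y for c in classes for y in c]
    mean = sum(values) / (n * q)
    var = sum(y * y for y in values) / (n * q) - mean * mean
    class_means = [sum(c) / q for c in classes]
    var_means = sum(m * m for m in class_means) / n - mean * mean
    return (q * var_means / var - 1.0) / (q - 1)


if __name__ == "__main__":
    data = [[1, 2, 4], [3, 5, 6], [2, 2, 7], [8, 9, 5]]  # invented values
    print(fraternal_by_pairs(data), fraternal_by_means(data))
```

The agreement is exact (up to rounding), since the sum of products over pairs within a class is determined by the class total and the sum of squares.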


From the correlation coefficients found for different values of $q$†, it is finally possible with knowledge of their S.D.'s to calculate a mean value of the fraternal correlation coefficient and its S.D.


In investigations of inheritance with animals with numerous offspring, where a great number of siblings are available, we have to face the problem of deciding what number of siblings it is profitable to employ for the investigation.


We shall state provisionally the problem as follows: with which value of $q$ do we, provided the number of examined offspring individuals ($nq$) be fixed, obtain the most accurately determined fraternal correlation coefficient? Or in other words, for which value of $q$ is

$$\frac{1}{q-1}\,\{1+r(q-1)\}^2$$

a minimum?

The condition of minimum is

$$q = 1 + \frac{1}{r}.$$

Corresponding to the values $\tfrac14$, $\tfrac13$, $\tfrac12$ and $\tfrac35$ for $r$ the values of $q$ are 5, 4, 3 and $2\tfrac23$.

* Vide K. Smith, Comptes-Rendus des Trav. du Lab. Carlsberg, Vol. XIV. No. 11, 1921, p. 8, where the formula is deduced for the special case $q=10$.

† In the memoir quoted it is shewn (p. 29) that the above formula may also be written

$$\rho = 1 - \frac{q}{q-1}\,\frac{\sigma_q'^2}{\sigma^2},$$

$\sigma_q'^2$ being the squared S.D. inside fraternities of $q$ siblings and being calculated as a mean of such values obtained from each of the $n$ fraternities. We may here instead of $\sigma_q'$ introduce the presumptive S.D. inside a fraternity, $\sigma_0$, that is the S.D. we expect to find in fraternities consisting of a great number of siblings. The relation is

$$\sigma_0^2 = \frac{q}{q-1}\,\sigma_q'^2,$$

so that we find

$$r = 1 - \frac{\sigma_0^2}{\sigma^2},$$

which shews that the value of $r$ arrived at must be expected to be independent of $q$.

In examining the question of the most profitable number of siblings, attention 
must also be paid to the determination of the parental correlation and the question 
will therefore be further discussed in the following section. Besides it cannot be left 
out of consideration that, as a rule, it will be easier to examine the same number 
of individuals distributed among a smaller than among a greater number of frater- 
nities. When regard only is had to fraternal correlation, the values of q obtained 
above must therefore be considered the minimum values. 

For a more detailed illustration of the variation of the S.D. of the fraternal correlation coefficient with the number of siblings Table II has been calculated. The table gives the values of the S.D. for 1000 observations distributed among from 500 to 100 fraternities, the sizes of which therefore vary from 2 to 10.


TABLE II.


The Standard Deviation of a Fraternal Correlation Coefficient calculated from 1000 observed Individuals.


 q    r=1/4   r=1/3   r=1/2   r=3/5

 2    .0419   .0398   .0335   .0286
 3    .0356   .0351   .0316   .0278
 4    .0339   .0344   .0323   .0289
 5    .0335   .0348   .0335   .0304
 6    .0337   .0356   .0350   .0320
 7    .0342   .0365   .0365   .0336
 8    .0349   .0376   .0380   .0352
 9    .0356   .0387   .0395   .0367
10    .0363   .0398   .0410   .0382
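Table II and the condition of minimum $q = 1 + 1/r$ can both be checked from formula (19) with $nq$ fixed at 1000. The sketch below is an illustration only; the function names and the search range `q_max` are arbitrary choices, and the search is restricted to whole numbers of siblings.

```python
import math


def sd_fraternal_fixed_total(total, q, r):
    """Formula (19) written with nq = total number of examined individuals."""
    return (1.0 - r) * (1.0 + (q - 1) * r) * math.sqrt(2.0 / (total * (q - 1)))


def best_integer_q(r, total=1000, q_max=50):
    """Whole number of siblings giving the smallest S.D. for a fixed total."""
    return min(range(2, q_max + 1), key=lambda q: sd_fraternal_fixed_total(total, q, r))


if __name__ == "__main__":
    for r in (1 / 4, 1 / 3, 1 / 2, 3 / 5):
        row = " ".join(f"{sd_fraternal_fixed_total(1000, q, r):.4f}" for q in range(2, 11))
        print(f"r={r:.3f}: {row}  best q={best_integer_q(r)}  1+1/r={1 + 1 / r:.2f}")
```

For $r = \tfrac35$ the continuous optimum $2\tfrac23$ lies between 2 and 3, and among whole numbers $q=3$ gives the smaller S.D., in agreement with the last column of the table.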


The table does not show a rapid increase of the S.D. when the number of siblings increases beyond the most profitable number found above. But a comparison of the values for $q=5$ and for $q=10$ still shows that the latter are respectively 8%, 14%, 22% and 25% greater than the former, so that when there are 10 siblings in each fraternity respectively 18%, 31%, 50% and 58% more individuals are required to obtain the same accuracy than when there are only 5 siblings from each family.


(h) Application of the Formula to previous Calculations of Correlation. 


In an investigation* concerning the characters number of vertebrae ('Vert.'), number of rays in the pectoral fins ('Pd.' and 'Ps.') and number of pigment spots ('Pigm.') in Zoarces viviparus from the station Nakkehage in Isefjord, Denmark, the fraternal correlation coefficient was calculated for 6 (for pigment spots only 5) samples from different years consisting of fraternities of 10 siblings. In this case the probable error of the fraternal correlation coefficient is according to (19)

$$\text{P.E.}(r) = \frac{0.67449}{\sqrt{45n}}\,(1-r)(1+9r).$$

Table III gives for each sample the values of $n$, $r$ and P.E.$(r)$, as well as $r$ for all the samples, each weighted according to the S.D.

* K. Smith, Comptes-Rendus des Trav. du Lab. Carlsberg, Vol. XIV. No. 11, 1921.


TABLE III.


Fraternal Correlation.


Year when            Vert.                    Pd.                    Pigm.
sample taken     n    r ± P.E.            n    r ± P.E.           n    r ± P.E.

1914            138   0.4590 ± .0238     132   0.3169 ± .0231     -        -
1915            168   0.4693 ± .0215     174   0.4196 ± .0211     75   0.3175 ± .0306
1916            123   0.5108 ± .0248     122   0.3985 ± .0251     87   0.3418 ± .0289
1917            177   0.4715 ± .0209     176   0.3634 ± .0206    127   0.4112 ± .0247
1918            153   0.4801 ± .0225     156   0.3329 ± .0215    113   0.3074 ± .0247
1919             98   0.4066 ± .0281      98   0.2893 ± .0260     86   0.3722 ± .0296

From total
samples               0.4689 ± .0095           0.3564 ± .0092          0.3517 ± .0122
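The probable errors of Table III can be recomputed from the formula above, and the bottom row appears to be reproduced if each sample is weighted inversely as the square of its probable error. That weighting rule is an assumption made for this sketch (the text says only that the samples are weighted according to the S.D.), and the function names are hypothetical. Only the 'Vert.' column is used.

```python
import math

# (n, r) for the six 'Vert.' samples of Table III, 1914 to 1919.
VERT = [(138, 0.4590), (168, 0.4693), (123, 0.5108),
        (177, 0.4715), (153, 0.4801), (98, 0.4066)]


def probable_error(n, r, q=10):
    """0.67449 times the S.D. given by formula (19); q = 10 gives sqrt(45 n)."""
    sd = (1.0 - r) * (1.0 + (q - 1) * r) * math.sqrt(2.0 / (n * q * (q - 1)))
    return 0.67449 * sd


def weighted_mean(samples, q=10):
    """Mean of r with weights 1 / P.E.^2 (assumed rule) and its probable error."""
    weights = [1.0 / probable_error(n, r, q) ** 2 for n, r in samples]
    total = sum(weights)
    mean = sum(w * r for w, (_, r) in zip(weights, samples)) / total
    return mean, 1.0 / math.sqrt(total)


if __name__ == "__main__":
    for n, r in VERT:
        print(n, r, round(probable_error(n, r), 4))
    print(weighted_mean(VERT))
```

Under this assumption the weighted mean and its probable error for 'Vert.' come out close to the tabulated $0.4689 \pm .0095$.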


For the mean values of $r$ probable errors have previously been calculated based on the 6 or 5 values found. These probable errors had for Vert., Pd. and Pigm. respectively the values 0.0094, 0.0137 and 0.0128, which for Vert. and Pigm. agree extremely well with the theoretical values now found, while for Pd. the error had been estimated somewhat too great.


II. PARENTAL CORRELATION. 


For investigation of parental correlation we have a sample consisting as above of $nq$ offspring values $y_1, y_2, y_3, \ldots, y_{nq}$ distributed in $n$ classes with $q$ in each, and in addition containing for each class an observed parental value $x$. We aim at finding the correlation between $x$ and $y$'s of the same class.

Let the parental correlation be $r_p$ and the S.D. for $x$'s be $s'$ in the population which we may imagine that the sample represents, and let us choose the mean value of the population as zero point for $x$.

The parental correlation coefficient is from the sample determined by

$$\rho_p = \frac{\Pi_{xy}}{\sigma\sigma'},$$

where $\sigma'$ is the S.D. of $x$ calculated from the sample, and $\Pi_{xy}$ is the product moment for $x$ and $y$ determined by

$$\Pi_{xy} = \frac{1}{nq}\,\Sigma(x_1y_1) - \bar x\,\bar y. \tag{20}$$

As in the previous section $\Sigma$ indicates a sum of products each of which consists of factors from the same class. In the sums $S$ each product contains factors from at least two classes, and when two factors belong to the same class it is indicated by an 's' inserted between them.

For evaluation of the standard deviation of $\rho_p$ the S.D.'s of $\Pi_{xy}$, $\sigma^2$ and $\sigma'^2$ are required, as well as the product moments for each pair of these three functions.

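(a) Mean Value and Standard Deviation of the Product Moment $\Pi_{xy}$.

The equation (20) may also be written

$$\Pi_{xy} = \frac{n-1}{n^2q}\,\Sigma(x_1y_1) - \frac{1}{n^2q}\,S(x_1y_1). \tag{21}$$

By taking the mean value for a great number of samples we therefore find

$$\overline{\Pi_{xy}} = \frac{n-1}{n}\,r_p\,ss'. \tag{22}$$

From (21) we find by squaring and taking mean value

$$n^4q^2\,\overline{\Pi_{xy}^2} = (n-1)^2\,\overline{(\Sigma(x_1y_1))^2} + \overline{(S(x_1y_1))^2} - 2(n-1)\,\overline{\Sigma(x_1y_1)\,S(x_1y_1)}. \tag{23}$$

Together with the determination of the mean values occurring here, we shall determine the other mean values of products required for the evaluation of $\sigma_{\rho_p}$. They are such as arise from multiplication of $\Sigma(x_1y_1)$ and $S(x_1y_1)$ with each of the two groups $\Sigma(y_1^2)$, $\Sigma(y_1y_2)$, $S(y_1y_2)$ and $\Sigma(x_1^2)$, $S(x_1x_2)$, and also those which contain a factor of each of the two latter groups. As in the foregoing section, we need, however, not consider products of a $\Sigma$ and an $S$, because such products may be developed into sums of products all containing a factor uncorrelated with all the other factors of the product, from which it follows that the mean value for a great number of samples is zero for each of these sums of products. It remains to determine the following products:

$$\begin{aligned}
(\Sigma(x_1y_1))^2 &= \Sigma(x_1^2y_1^2) + 2\,\Sigma(x_1^2y_1y_2) + 2\,S(\underset{s}{x_1y_1}\,\underset{s}{x_2y_2}),\\
(S(x_1y_1))^2 &= S(x_1^2y_1^2) + 2\,S(x_1^2\,\underset{s}{y_1y_2}) + 2\,S(x_1\,\underset{s}{y_1x_2}\,y_2)^{*} + \epsilon_1,\\
\Sigma(x_1y_1)\,\Sigma(y_1^2) &= \Sigma(x_1y_1^3) + \Sigma(x_1y_1^2y_2) + S(\underset{s}{x_1y_1}\,y_2^2),\\
\Sigma(x_1y_1)\,\Sigma(y_1y_2) &= \Sigma(x_1y_1^2y_2) + 3\,\Sigma(x_1y_1y_2y_3) + S(\underset{s}{x_1y_1}\,\underset{s}{y_2y_3}),\\
S(x_1y_1)\,S(y_1y_2) &= S(\underset{s}{x_1y_1}\,y_2^2) + 2\,S(\underset{s}{x_1y_1}\,\underset{s}{y_2y_3}) + \epsilon_2,\\
\Sigma(x_1y_1)\,\Sigma(x_1^2) &= \Sigma(x_1^3y_1) + S(x_1^2\,\underset{s}{x_2y_1}),\\
S(x_1y_1)\,S(x_1x_2) &= S(x_1^2\,\underset{s}{x_2y_1}) + \epsilon_3,\\
\Sigma(x_1^2)\,\Sigma(y_1^2) &= \Sigma(x_1^2y_1^2) + S(x_1^2y_1^2),\\
\Sigma(x_1^2)\,\Sigma(y_1y_2) &= \Sigma(x_1^2y_1y_2) + S(x_1^2\,\underset{s}{y_1y_2}),\\
S(x_1x_2)\,S(y_1y_2) &= S(\underset{s}{x_1y_1}\,\underset{s}{x_2y_2}) + \epsilon_4.
\end{aligned} \tag{24}$$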
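$\epsilon_1$, $\epsilon_2$, $\epsilon_3$ and $\epsilon_4$ are sums, the means of which are 0. The product moments are, as in the previous section, denoted by $\beta$, and the indices concerning "$x$'s" are placed in front of $\beta$; for instance the product moment corresponding to $x_1^3y_1$ is denoted by ${}_3\beta_1$. We thus find for the mean values of the sums occurring in (24):

$$\begin{aligned}
\overline{\Sigma(x_1^3y_1)} &= nq\;{}_3\beta_1,\\
\overline{\Sigma(x_1^2y_1^2)} &= nq\;{}_2\beta_2,\\
\overline{\Sigma(x_1^2y_1y_2)} &= \tfrac12\,nq(q-1)\;{}_2\beta_{11},\\
\overline{\Sigma(x_1y_1^3)} &= nq\;{}_1\beta_3,\\
\overline{\Sigma(x_1y_1^2y_2)} &= nq(q-1)\;{}_1\beta_{21},\\
\overline{\Sigma(x_1y_1y_2y_3)} &= \tfrac16\,nq(q-1)(q-2)\;{}_1\beta_{111},\\
\overline{S(x_1^2\,\underset{s}{x_2y_1})} &= n(n-1)q\;\underset{d\,s}{{}_{21}\beta_1},\\
\overline{S(x_1^2y_1^2)} &= n(n-1)q\;\underset{d}{{}_2\beta_2},\\
\overline{S(x_1^2\,\underset{s}{y_1y_2})} &= \tfrac12\,n(n-1)q(q-1)\;\underset{d\,s}{{}_2\beta_{11}},\\
\overline{S(\underset{s}{x_1y_1}\,\underset{s}{x_2y_2})} &= \tfrac12\,n(n-1)q^2\;\underset{d\,s\,d}{{}_{11}\beta_{11}},\\
\overline{S(\underset{s}{x_1y_1}\,y_2^2)} &= n(n-1)q^2\;\underset{s\,d}{{}_1\beta_{12}},\\
\overline{S(\underset{s}{x_1y_1}\,\underset{s}{y_2y_3})} &= \tfrac12\,n(n-1)q^2(q-1)\;\underset{s\,d\,s}{{}_1\beta_{111}}.
\end{aligned} \tag{25}$$

From Bergström's formulae (9) we find, when introducing $r_p$, $r$ and 0 for the correlation coefficients and remembering that in his formulae $s$ and $s'$ are taken as units for $y$ and $x$:

$$\begin{aligned}
{}_3\beta_1 &= 3r_p\,ss'^3,\\
{}_2\beta_2 &= (2r_p^2+1)\,s^2s'^2,\\
{}_2\beta_{11} &= (2r_p^2+r)\,s^2s'^2,\\
{}_1\beta_3 &= 3r_p\,s^3s',\\
{}_1\beta_{21} &= r_p(1+2r)\,s^3s',\\
{}_1\beta_{111} &= 3rr_p\,s^3s',\\
\underset{d\,s}{{}_{21}\beta_1} &= r_p\,ss'^3,\\
\underset{d}{{}_2\beta_2} &= s^2s'^2,\\
\underset{d\,s}{{}_2\beta_{11}} &= r\,s^2s'^2,\\
\underset{d\,s\,d}{{}_{11}\beta_{11}} &= r_p^2\,s^2s'^2,\\
\underset{s\,d}{{}_1\beta_{12}} &= r_p\,s^3s',\\
\underset{s\,d\,s}{{}_1\beta_{111}} &= rr_p\,s^3s'.
\end{aligned} \tag{26}$$

* In this single case the notation fails, as it ought to be indicated that the first $x$ and the last $y$ belong to the same class.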
Applying (25) and (26) we find for the mean values of the products under (24) the following values:

$$\begin{aligned}
\overline{(\Sigma(x_1y_1))^2} &= nq\,\{q(n+1)r_p^2 + 1 + (q-1)r\}\,s^2s'^2,\\
\overline{(S(x_1y_1))^2} &= nq(n-1)\,\{qr_p^2 + 1 + (q-1)r\}\,s^2s'^2,\\
\overline{\Sigma(x_1y_1)\,\Sigma(y_1^2)} &= nq\,\{nq+2+2(q-1)r\}\,r_p\,s^3s',\\
\overline{\Sigma(x_1y_1)\,\Sigma(y_1y_2)} &= \tfrac12\,nq(q-1)\,\{2+(nq+2q-2)r\}\,r_p\,s^3s',\\
\overline{S(x_1y_1)\,S(y_1y_2)} &= n(n-1)q^2\,\{1+(q-1)r\}\,r_p\,s^3s',\\
\overline{\Sigma(x_1y_1)\,\Sigma(x_1^2)} &= nq(n+2)\,r_p\,ss'^3,\\
\overline{S(x_1y_1)\,S(x_1x_2)} &= n(n-1)q\,r_p\,ss'^3,\\
\overline{\Sigma(x_1^2)\,\Sigma(y_1^2)} &= nq\,(n+2r_p^2)\,s^2s'^2,\\
\overline{\Sigma(x_1^2)\,\Sigma(y_1y_2)} &= \tfrac12\,nq(q-1)\,\{2r_p^2+nr\}\,s^2s'^2,\\
\overline{S(x_1x_2)\,S(y_1y_2)} &= \tfrac12\,n(n-1)q^2\,r_p^2\,s^2s'^2.
\end{aligned} \tag{27}$$

We may now continue the calculation of $\overline{\Pi_{xy}^2}$. Introducing the mean values in (23), we get

$$n^2q\,\overline{\Pi_{xy}^2} = (n-1)\,\{nqr_p^2 + 1 + (q-1)r\}\,s^2s'^2.$$

From (22) we find

$$n^2q\,(\overline{\Pi_{xy}})^2 = (n-1)^2\,qr_p^2\,s^2s'^2,$$

and when this equation is subtracted from the foregoing

$$\sigma_{\Pi_{xy}}^2 = \overline{\Pi_{xy}^2} - (\overline{\Pi_{xy}})^2 = \frac{n-1}{n^2q}\,\{qr_p^2 + 1 + r(q-1)\}\,s^2s'^2. \tag{28}$$
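(b) The Product Moment, $\Pi_{\Pi_{xy}\sigma^2}$, of $\Pi_{xy}$ and $\sigma^2$.

Multiplication of (4) and (21) gives

$$n^4q^3\,\Pi_{xy}\,\sigma^2 = (nq-1)(n-1)\,\Sigma(y_1^2)\,\Sigma(x_1y_1) - 2(n-1)\,\Sigma(x_1y_1)\,\Sigma(y_1y_2) + 2\,S(x_1y_1)\,S(y_1y_2) + \gamma_1,$$

where $\gamma_1$ consists of terms $S \times \Sigma$, the mean values of which are zero. Taking the mean value and applying (27) we therefore find

$$n^3q^2\,\overline{\Pi_{xy}\,\sigma^2} = (n-1)\,r_p\,\{nq(nq+1) + rnq(q-1)\}\,s's^3.$$

For $\overline{\Pi_{xy}}\cdot\overline{\sigma^2}$ we get from (3) and (22)

$$n^3q^2\,\overline{\Pi_{xy}}\cdot\overline{\sigma^2} = (n-1)\,r_p\,\{nq(nq-1) - rnq(q-1)\}\,s's^3,$$

and accordingly from the two latter equations

$$\Pi_{\Pi_{xy}\sigma^2} = \overline{\Pi_{xy}\,\sigma^2} - \overline{\Pi_{xy}}\cdot\overline{\sigma^2} = \frac{2(n-1)}{n^2q}\,r_p\,\{1+r(q-1)\}\,s's^3. \tag{29}$$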
(c) The Product Moment, $\Pi_{\Pi_{xy}\sigma'^2}$, of $\Pi_{xy}$ and $\sigma'^2$.


For $\sigma'^2$ we obtain, from the formulae (4), (3) and (12), which concern $\sigma^2$, by substituting $x$ for $y$ and putting $q$ equal to 1:

$$\sigma'^2 = \frac{n-1}{n^2}\,\Sigma(x_1^2) - \frac{2}{n^2}\,S(x_1x_2), \tag{30}$$

$$\overline{\sigma'^2} = \frac{n-1}{n}\,s'^2, \tag{31}$$

and

$$\sigma_{\sigma'^2}^2 = \frac{2(n-1)}{n^2}\,s'^4. \tag{32}$$

By multiplication of (21) and (30) we get

$$n^4q\,\Pi_{xy}\,\sigma'^2 = (n-1)^2\,\Sigma(x_1y_1)\,\Sigma(x_1^2) + 2\,S(x_1y_1)\,S(x_1x_2) + \gamma_2,$$

for which the mean value by application of (27) is found to be

$$n^2\,\overline{\Pi_{xy}\,\sigma'^2} = (n^2-1)\,r_p\,s'^3s.$$

From (22) and (31) follows

$$n^2\,\overline{\Pi_{xy}}\cdot\overline{\sigma'^2} = (n-1)^2\,r_p\,s'^3s,$$

so that

$$\Pi_{\Pi_{xy}\sigma'^2} = \overline{\Pi_{xy}\,\sigma'^2} - \overline{\Pi_{xy}}\cdot\overline{\sigma'^2} = \frac{2(n-1)}{n^2}\,r_p\,s'^3s. \tag{33}$$


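(d) The Product Moment, $\Pi_{\sigma^2\sigma'^2}$, of $\sigma^2$ and $\sigma'^2$.


For the product $\sigma^2\sigma'^2$ we find by multiplying (4) and (30):

$$n^4q^2\,\sigma^2\sigma'^2 = (nq-1)(n-1)\,\Sigma(x_1^2)\,\Sigma(y_1^2) - 2(n-1)\,\Sigma(x_1^2)\,\Sigma(y_1y_2) + 4\,S(x_1x_2)\,S(y_1y_2) + \gamma_3.$$

The mean value of $\gamma_3$ is zero, and therefore by taking the mean value and using (27) we get

$$n^2q\,\overline{\sigma^2\sigma'^2} = (n-1)\,\{nq-1+2qr_p^2-(q-1)r\}\,s^2s'^2,$$

and when from this is subtracted

$$n^2q\,\overline{\sigma^2}\cdot\overline{\sigma'^2} = (n-1)\,\{nq-1-(q-1)r\}\,s^2s'^2,$$

we arrive at

$$\Pi_{\sigma^2\sigma'^2} = \frac{2(n-1)}{n^2}\,r_p^2\,s^2s'^2. \tag{34}$$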
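(e) The Standard Deviation of the Parental Correlation Coefficient.

For the logarithm of the parental correlation coefficient calculated from our sample we have

$$\log\rho_p = \log\Pi_{xy} - \tfrac12\log\sigma^2 - \tfrac12\log\sigma'^2.$$

For great values of $n$, which allow us to treat the deviations of $\sigma'^2$, $\sigma^2$ and $\Pi_{xy}$ from their mean values as differentials, it follows from the above equation by differentiation that

$$\frac{\delta\rho_p}{\bar\rho_p} = \frac{\delta\Pi_{xy}}{\overline{\Pi_{xy}}} - \frac{\delta\sigma^2}{2\,\overline{\sigma^2}} - \frac{\delta\sigma'^2}{2\,\overline{\sigma'^2}}. \tag{35}$$

With the accuracy here employed, which excludes the determination of terms containing the higher power of $\dfrac1n$, we have $\bar\rho_p = r_p$.

From (35) we find by squaring and taking the mean value for a great number of samples

$$\sigma_{\rho_p}^2 = r_p^2\left\{\frac{\sigma_{\Pi_{xy}}^2}{(\overline{\Pi_{xy}})^2} + \frac{\sigma_{\sigma^2}^2}{4(\overline{\sigma^2})^2} + \frac{\sigma_{\sigma'^2}^2}{4(\overline{\sigma'^2})^2} - \frac{\Pi_{\Pi_{xy}\sigma^2}}{\overline{\Pi_{xy}}\;\overline{\sigma^2}} - \frac{\Pi_{\Pi_{xy}\sigma'^2}}{\overline{\Pi_{xy}}\;\overline{\sigma'^2}} + \frac{\Pi_{\sigma^2\sigma'^2}}{2\,\overline{\sigma^2}\;\overline{\sigma'^2}}\right\},$$

which by introducing the values from (3), (12), (22), (28), (29), (31), (32), (33) and (34) and neglecting the term containing the higher power of $\dfrac1n$ leads to

$$\sigma_{\rho_p}^2 = \frac{1}{nq}\,\{1+(q-1)r+qr_p^2\} + \frac{r_p^2}{2nq}\,\{1+(q-1)r^2\} + \frac{r_p^2}{2n} - \frac{2r_p^2}{nq}\,\{1+(q-1)r\} - \frac{2r_p^2}{n} + \frac{r_p^4}{n},$$

or

$$\sigma_{\rho_p}^2 = \frac{1}{nq}\left\{1+(q-1)r - \frac{r_p^2}{2}\,[q+3+(q-1)r(4-r)] + qr_p^4\right\},$$

which may be written

$$\sigma_{\rho_p}^2 = \frac{(1-r_p^2)^2}{n} - \frac{q-1}{nq}\,(1-r)\left\{1-\frac{r_p^2}{2}\,(3-r)\right\}. \tag{36}$$

The first term is the usual expression obtained for $q=1$. From this, for $q>1$, one must subtract a term which for given values of $r$ and $r_p$ increases with $q$.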

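(f) The Standard Deviation of the Slopes of the Regression Curves.

We shall finally add the formulae of the S.D. of the slope of the regression curves, for the calculation of which we have all the material ready. The regression coefficients are determined by

$$a_1 = \frac{\Pi_{xy}}{\sigma^2} \quad\text{and}\quad a_2 = \frac{\Pi_{xy}}{\sigma'^2}.$$

By differentiation, squaring and taking mean value, we find

$$\sigma_{a_1}^2 = \bar a_1^2\left\{\frac{\sigma_{\Pi_{xy}}^2}{(\overline{\Pi_{xy}})^2} + \frac{\sigma_{\sigma^2}^2}{(\overline{\sigma^2})^2} - \frac{2\,\Pi_{\Pi_{xy}\sigma^2}}{\overline{\Pi_{xy}}\;\overline{\sigma^2}}\right\}$$

and a corresponding equation for $\sigma_{a_2}^2$. From these we find by introducing the S.D. and product moments

$$\sigma_{a_2}^2 = \frac{s^2}{nq\,s'^2}\,\{1+r(q-1)-qr_p^2\}^{*}$$

and

$$\sigma_{a_1}^2 = \frac{s'^2}{nq\,s^2}\,\{1+r(q-1)-qr_p^2+2(q-1)\,r_p^2\,(1-r)^2\}.$$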

* Vide K. Smith, l.c., pp. 6, 7, where the same formula is deduced in a different form, containing 
a 
_is neglected. 


q instead of +, The two expressions are easily seen to be identical when the term 
2 
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(g) Numerical Evaluation of the Formula for the s.p. of a Parental 
Correlation Coefficient. 


We shall first examine how valuable a material consisting of n groups of q 
siblings with corresponding parental values is compared with nq pairs of values from 
different families. Denoting the s.D.’s of p, calculated from the two materials by 
and o,,, we find by applying (36) 


od Pp 


ies 1" py ae (i= Te 


Vq 


Ty pp q(1—r,?—-(q-1) (1-7) i! — 1, a 

This ratio indicates the value of an observed pair, when the parental value also
occurs combined with (q − 1) other offspring values, in proportion to the value of an
observed pair when the parental value only occurs once in the calculation.


The numerical values of (37) are given in Table IV for values of r_p and r fairly
well representative of the values met with in investigations of inheritance.


TABLE IV.

u_q = σ'²_{r_p} : σ²_{r_p}.

        r_p = .3    r_p = .4    r_p = .5
  q     r = .4      r = .5      r = .6
  1     1.000       1.000       1.000
  2      .735        .698        .666
  3      .581        .536        .499
  4      .481        .435        .399
  5      .410        .366        .332
  6      .357        .316        .285
  7      .316        .278        .249
  8      .284        .248        .221
  9      .258        .224        .199
 10      .236        .204        .181
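The entries of Table IV can be checked numerically. The sketch below assumes formula (36) reads σ²(r_p) = (1/n)[(1 − r_p²)² − ((q − 1)/q)(1 − r){1 − ½(3 − r) r_p²}], a reading of the damaged print that was chosen because it reproduces the tabulated values; the function names are illustrative and not taken from the paper.

```python
def var_parental_r(r_p, r, n, q):
    """Variance of r_p for n families of q offspring each (assumed form of (36))."""
    usual = (1 - r_p**2) ** 2                      # the ordinary q = 1 expression
    correction = (q - 1) / q * (1 - r) * (1 - 0.5 * (3 - r) * r_p**2)
    return (usual - correction) / n


def u_q(r_p, r, q, n=100):
    """Ratio (37): variance from n*q independent pairs over variance from n groups of q."""
    independent = var_parental_r(r_p, r, n * q, 1)
    grouped = var_parental_r(r_p, r, n, q)
    return independent / grouped


if __name__ == "__main__":
    columns = [(0.3, 0.4), (0.4, 0.5), (0.5, 0.6)]  # (r_p, r) pairs of Table IV
    for q in range(1, 11):
        print(q, "  ".join(f"{u_q(r_p, r, q):.3f}" for r_p, r in columns))
```

The ratio does not depend on n, so the default value of n is arbitrary.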


It appears that by entering into the same parental correlation table families with
numbers of offspring varying from, for example, 1 to 5, the same weight is given
to pairs of observations which according to Table IV ought to vary in weight from
1 to about 1/3.

It is therefore a more rational proceeding to sort the families according to the
number of offspring and deal with each group separately. The work may then be
shortened by calculating the correlation coefficient between the parental value
and the mean for the offspring, from which the parental correlation for individuals is
obtained by multiplying with σ_q/σ, σ_q being as above (see I (g)) the S.D. for means
of fraternities of q individuals. It is then possible to calculate the correlation
coefficient with S.D. for each group of families and finally calculate a mean value
for the correlation coefficient.


In investigations of inheritance with animals with numerous offspring it is as
a rule easier to provide information of a given number of individuals among
a small number of families than to examine the same number of individuals if
they belong to a larger number of families. The labour required is therefore not
proportional to the number of individuals and it must be estimated for the
individual materials whether the encumbrance of dealing with a relatively large
number of families is duly compensated for by the reduction of the number of
individuals hereby permissible.


It does not seem at the outset probable, but it may be possible, that, even in
cases in which parent and offspring are equally easily available for investigation,
a shortening of labour, that is, a diminution of the total number of observed
individuals, may be obtainable by examining several offspring individuals of each
family. We will therefore examine for which value of q, σ²_{r_p} is a minimum
when n(q + 1) is put equal to a constant k. We find the condition

\[ (1-r_p^2)^2 - \left(1+\frac{1}{q^2}\right)(1-r)\left\{1-\frac{3-r}{2}\,r_p^2\right\} = 0 \]

from which follows

\[ q^2 = \frac{(1-r)\left\{1-\frac{3-r}{2}\,r_p^2\right\}}{(1-r_p^2)^2-(1-r)\left\{1-\frac{3-r}{2}\,r_p^2\right\}} \]


To obtain a survey we introduce a few sets of values for r_p and r for which we
give the result in Table V.

TABLE V.

 r_p      r       q
 0.20    0.25    1.8
 0.30    0.40    1.3
 0.50    0.60    1.0
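The optimum q of Table V can be recomputed as a check. The sketch assumes the same reading of (36) as above, under which minimising the variance with n(q + 1) held constant gives q² = B/(A − B), where A = (1 − r_p²)² and B = (1 − r){1 − ½(3 − r) r_p²}. The symbols A and B and the function name are introduced here for illustration only.

```python
import math


def optimal_q(r_p, r):
    """Offspring per family minimising var(r_p) when n*(q + 1) is fixed (assumed form)."""
    a = (1 - r_p**2) ** 2
    b = (1 - r) * (1 - 0.5 * (3 - r) * r_p**2)
    return math.sqrt(b / (a - b))


if __name__ == "__main__":
    for r_p, r in [(0.20, 0.25), (0.30, 0.40), (0.50, 0.60)]:
        print(f"r_p = {r_p:.2f}  r = {r:.2f}  q = {optimal_q(r_p, r):.1f}")
```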


It will be seen that for sufficiently small values of r_p and r it is profitable to
examine several siblings of each family in those cases where the examination of
an offspring individual requires the same labour as that of a parent.


As a guide for the choice of the number of offspring in the more frequently
occurring case when it is easier to provide data of offspring than of parent, we
give in Table VI for some values of r_p and r the number of observations which,
for varying values of q, yield the same accuracy in the parental correlation
coefficient as 1000 parents with 1000 offspring.


It appears from the table that while the number of offspring increases evenly
with increasing q the number of parents decreases more and more slowly, so that
the compensation obtained in this way for the increased total number of offspring
tends to be very small for increasing q. Already by increasing q from 5 to 6 we
find, for r_p = .3 and r = .4, that to outweigh the augmentation of 360 in the
number of offspring, we only get a diminution of 21 in the number of parents.


TABLE VI.

Number of Parental and Offspring Individuals which for varying q
yield the same Accuracy to r_p.

        r_p = .3, r = .4       r_p = .4, r = .5       r_p = .5, r = .6
        σ_{r_p} = .0288        σ_{r_p} = .0266        σ_{r_p} = .0237
        Number of Number of    Number of Number of    Number of Number of
  q     Parents   Offspring    Parents   Offspring    Parents   Offspring
  1     1000      1000         1000      1000         1000      1000
  2      680      1360          717      1433          751      1502
  3      573      1720          622      1866          668      2004
  4      520      2081          575      2299          627      2507
  5      488      2441          546      2732          602      3009
  6      467      2801          528      3166          585      3511
  7      452      3161          514      3599          573      4013
  8      440      3522          504      4032          565      4516
  9      431      3882          496      4465          558      5018
 10      424      4242          490      4898          552      5520
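The numbers in Table VI follow from equating the variance for n families of q offspring to that for 1000 single pairs. A minimal sketch, again assuming the reading of (36) used above; the function name and the `baseline` parameter are illustrative.

```python
def parents_needed(r_p, r, q, baseline=1000):
    """Families of q offspring giving the same var(r_p) as `baseline` single pairs."""
    a = (1 - r_p**2) ** 2
    b = (1 - r) * (1 - 0.5 * (3 - r) * r_p**2)
    # var(n, q) = (a - (q-1)/q * b) / n is set equal to a / baseline and solved for n.
    return baseline * (a - (q - 1) / q * b) / a


if __name__ == "__main__":
    for q in range(1, 11):
        n = parents_needed(0.3, 0.4, q)
        print(q, round(n), round(q * n))  # parents, offspring for r_p = .3, r = .4
```

For r_p = .3 and r = .4 the step from q = 5 to q = 6 gives 21 fewer parents against 360 more offspring, as stated in the text.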


For fraternal correlation we have found (see Table II) that the most profitable
number of offspring was 3 to 4 for the values of r now considered, and that a
somewhat greater number was not substantially opposed to economy of work.
Whether the number ought to be increased beyond 3 to 4 or confined to even
fewer offspring individuals from each family depends in each investigation upon
the relative difficulty of observing parents and offspring.


(h) Application of the Formula to previous Calculations of Correlation.


For the investigation of Zoarces viviparus mentioned in the previous section,
in which 10 offspring individuals were examined for each mother, we have
according to (36) the following formula for the probable error of the maternal
correlation coefficient:

\[ \text{P.E.}(r_p) = \frac{0.67449}{\sqrt{n}}\left[(1-r_p^2)^2 - \frac{9}{10}\,(1-r)\left\{1-\frac{3-r}{2}\,r_p^2\right\}\right]^{\frac{1}{2}} \]

In Table VII are found the values of r_p for the three characters examined as
well as their probable errors calculated from this formula. Giving each of these
values of r_p its due weight we have calculated a mean value and its probable
error.
TABLE VII.

Maternal Correlation.

 Year when        Vert.              Pd.                Pigm.
 sample taken     r_p ± P.E.         r_p ± P.E.         r_p ± P.E.
 1914             0.3513 ± .0343     0.2409 ± .0332
 1915             0.4375 ± .0281     0.3215 ± .0303     0.3762 ± .0381
 1916             0.4139 ± .0355     0.2116 ± .0387     0.3622 ± .0373
 1917             0.3775 ± .0298     0.2824 ± .0293     0.3722 ± .0332
 1918             0.4382 ± .0298     0.2928 ± .0298     0.3710 ± .0308
 1919             0.3674 ± .0378     0.1851 ± .0387     0.3380 ± .0398
 From total
 samples          0.4021 ± .0131     0.2654 ± .0133     0.3654 ± .0158


It appears that these probable errors agree extremely well with those originally 
calculated* on the basis of the 5 or 6 values of the correlation coefficient obtained 
from 5 or 6 samples. 
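The paper does not state how each value was given "its due weight". The sketch below assumes weights inversely proportional to the squared probable errors, which is only one possible reading; applied to the Vert. column as read from Table VII it gives a mean and probable error close to the tabulated total.

```python
import math

# Vert. column of Table VII as read here: (r_p, probable error) for 1914 to 1919.
VERT = [(0.3513, 0.0343), (0.4375, 0.0281), (0.4139, 0.0355),
        (0.3775, 0.0298), (0.4382, 0.0298), (0.3674, 0.0378)]


def weighted_mean(values):
    """Inverse-variance weighted mean and its probable error (assumed weighting)."""
    weights = [1.0 / pe**2 for _, pe in values]
    mean = sum(w * r for w, (r, _) in zip(weights, values)) / sum(weights)
    pe_mean = 1.0 / math.sqrt(sum(weights))
    return mean, pe_mean


if __name__ == "__main__":
    mean, pe = weighted_mean(VERT)
    print(f"{mean:.4f} +/- {pe:.4f}")
```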


Summary. 


In the first section we dealt with fraternal correlation and a formula was deduced 
for the standard deviation of the fraternal correlation coefficient for the case when 
the material of observation consists of equal numbers of offspring from each family 
and when each available pair of siblings is introduced into the calculation. The 
formula is calculated on the supposition of normal distribution and normal fraternal 
correlation. 


It is shewn by means of the formula that forming fraternal correlation tables
for fraternities of different numbers and giving each pair of observations the same
weight we disturb very highly the distribution of weight which the observations
must claim according to their nature. We find further from the formula that
when the number of observed offspring from each family may be freely chosen
the best determination of fraternal correlation from a given number of observations
is obtained by taking (1 + 1/r) offspring individuals from each family (r = frater.
corr. coeff.).

In the second section we deduce, also supposing normal distribution and
normal correlation, the S.D. of the parental correlation coefficient calculated from a
material comprising equal numbers of offspring from each family. The formula
shews that forming parental correlation tables of a material consisting of families
of different sizes we also in an unfortunate manner disturb the due distribution
of weight among the pairs of observation. It is shewn that if observations of
parents are as easily produced as those of offspring it is, for determination of
parental correlation, only for small values of the corr. coeffs., for instance r_p < .4
and r < .4, profitable to include more than one offspring individual from each
family in the calculation. For the case more frequently occurring, when the
observation of parents represents more labour or greater cost than that of offspring,
we have for certain values of r_p and r and varying sizes of fraternities calculated
such numbers of parents and of offspring which yield the same accuracy to the
parental correlation as 1000 parents with corresponding 1000 offspring. Table VI
shews that when the number of siblings exceeds 4 to 5, there is not much gained
by increasing it.

* Vide l.c., p. 24, Table 6.


Considering both fraternal and parental correlation we may therefore generally
conclude that an essential increase in the number of offspring beyond 1 + 1/r, i.e. in
practice 3 to 4, is only then to be recommended, when it causes a relatively
insignificant increase in labour.

This research has been occasioned by the investigations of inheritance carried
out by the Carlsberg Laboratorium, København, and I am much indebted to
Dr. Johs. Schmidt for the interest he has taken in my work.
I. INTRODUCTION.


Starting from Bessel’s discovery, in the early part of the last century, of the 
existence of a definite relative personal equation for two observers recording 
transits by the eye and ear method, there has been a continuous discussion among 
astronomers on the errors which such personal equations may introduce, and on
the methods of eliminating them or correcting for them*. In such discussions 
it has been the usual practice to take the yearly mean personal equation, whether 
relative or absolute, of different observers and to use this mean personal equation 
as the basis of any correction to be applied to observations made in that year. 
From a comparison of the yearly means it is admitted that there may be gradual 
secular changes in personal equation, but it is found that for experienced observers 
there is usually very little variation. In text-books on Practical Astronomy brief 
mention of the subject is usually made, and the conclusion drawn is that for an 
observer in normal health, the personal equation in any one type of observation
will remain sensibly constant for “short periods” of time; an exact definition of
the words “short period” is not and clearly cannot be attempted†. It is further
assumed that variations from the personal equation are due to accidental errors 
and may be taken as randomly distributed in accordance with the Gaussian Law. 
With the recent introduction of photography and mechanical methods of record, 
the interest of the astronomer in the subject has to some extent diminished, but 
there are many fields of scientific observation where the human element cannot be 
eliminated, and in the modern researches of the psychologist we find a study is 
made of problems of this type for their own interest and for the light which they 
may throw on the working of the human machine. 


One very important aspect of the problem of personal equation, and of
particular import to the astronomer, was discussed in detail in a paper entitled “On
the Mathematical Theory of Errors of Judgment, with Special Reference to the 
Personal Equation,” published in the Phil. Trans. (Vol. 198A, p. 285). In this 
case various series of experiments were carried out simultaneously by three 
observers under identical conditions and it was found that there was a marked 
correlation between the variations in absolute personal equation of the different 
observers. This in itself was sufficient to show that the judgments of any one 
observer were not randomly distributed about his mean personal equation. The 
purpose of the present paper is to discuss the variations in judgment of one
observer, and to inquire how far the evidence of four or five experiments suggests
that the theory of personal equation and of errors of judgment, as usually accepted, 
requires modification. 

The subject is a large one, and much beyond the scope of a single paper; but
by making careful inquiries of this type with the help of statistical methods, it
may be possible to construct a more generalised theory of errors of judgment
than that which has hitherto been adopted, and although the practical corrections
which such a theory will impose may not be large, yet a more detailed knowledge
of the nature of the variations and perhaps some insight into the psychological
and physiological factors which underlie them, will give the observer a clearer
idea of the precautions to be taken to avoid error and a greater justification for
confidence in his results.

* For example, Monthly Notices, Vol. XL. 1880, pp. 75, 165, 302 (Discussion of Greenwich
Observations of the Moon); Monthly Notices, Vol. XLIV. 1884, pp. 1 and 39 (Greenwich
Observations of the Sun); Monthly Notices, Vol. LVII. 1897, p. 504 (General Discussion of
relative personal Equations).

† For example, in Campbell’s Elements of Practical Astronomy, 1899, p. 157; Young’s General
Astronomy, Revised Edn. § 114, and Chauvenet’s Spherical and Practical Astronomy, 4th Edn. I. p. 189.


II. GENERALISED THEORY OF PERSONAL EQUATION. 


Before proceeding to the reduction of the Experiments which have been carried 
out, I will consider whether it is not possible to make a very general, and yet 
simple, analysis of personal equation. Let us suppose that we have a large number, 
N, of observations, which have been made in separate groups, or at what may be 
termed separate sessions. For the astronomer, a session will be a night’s work ; 
for the physicist or psychologist, one continuous set of readings or observations. 
Any particular observation y may be designated (1) by τ, a function of the time
when it was recorded, measured from some fixed epoch, or (2) by the number of
the session in which it was made, and t, the time of record measured from the
commencement of that session. E.g. an observation made in the pth session may
be written either as y_τ or _p y_t. We will suppose that the secular change can be
represented by the function φ(τ), but in addition to this change there may
be another of a different type which may be termed the sessional change, and
will be represented by the function f_p(t). The fundamental difference between
a secular and sessional change is this: if there is a break of some hours or perhaps 
days between two series of observations, the sessional change of the first series 
will have no influence on the judgments of the second series, while the secular 
change will continue from series to series. The sessional change is thus peculiar 
to its own session or series of observations, although it is very possible that the 
same type of change may be repeated in session after session; it may be a change 
resulting simply from fatigue or perhaps from more complex causes. Figure 3 
(p. 46) provides a good illustration of secular and sessional changes; the centres of 
the small circles represent the mean values of twenty different series of observations, 
and it will be seen that the general tendency is for a drop in mean judgment 
from left to right of the diagram; this is the secular change. The sessional
changes are represented by the continuous lines drawn through the centres of 
the circles, and the slope of these lines is on the whole seen to be very constant 
throughout the twenty series. In this case the secular and sessional changes are 
acting in the same direction, but they may well act in opposite directions. 

We have thus seen that an observation y may be expressed in the form

\[ y = \phi(\tau) + f_p(t) + Y_t \qquad \text{(i)} \]

where Y_t is the residual after the removal of secular and sessional changes. The
duration of the session is likely to be so short compared with the period over
which the secular change is measured, that τ may be taken as practically constant
for any one session, and φ(τ_p) may be described as the secular term in the
observations of the pth session. It remains therefore to consider the function f_p(t).
Supposing that there were n observations made in a session, it would of
course be possible to fit an (n − 1)th order parabola on which all the observations
would lie, so that the values of Y_t would all be zero, but such a curve would be
entirely useless. If the observations are made at finite intervals so that we can
imagine that one may be interpolated between two others, owing to the mass of
random errors to which each judgment is subject, we should not for a moment
expect that the interpolated error would lie on, or even close to, the (n − 1)th
order parabola. A curve of far lower order would probably give a much better fit.
If the sessional change is a sign of some physiological change of state which is
affecting the observer’s judgment, it is natural to suppose that it can be
represented fairly closely by some simple curve: a low order parabola if not a straight
line, or perhaps, if periodic, a sine curve. Suppose that in a practical case, a
first or second order parabola has been fitted to the observations of a session; then
it will be easy to test whether the residuals Y_t follow a Gaussian distribution;
a simple practically sufficient, if not theoretically sufficient, test would be to find
whether


\[ \frac{1}{n}S(Y_t^3) = 0 \quad \text{(ii)}, \qquad \frac{1}{n}S(Y_t^4) = 3\left\{\frac{1}{n}S(Y_t^2)\right\}^2 \quad \text{(iii)}, \]

approximately.


But there is a further possibility; it may be found that although the relations
(ii) and (iii) hold approximately, the Y_t’s are not randomly distributed in time,
and that there is in fact a correlation between the successive values of Y_t, so that


: os (oa et = eee ee 

ee cts saa remem cee Saeee rT 
he CA OG Mega) 
for perhaps several positive integral values of i from 1 upwards.
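Both checks described above can be sketched for a list of residuals Y_t taken in order of time. The moment ratios β1 = μ3²/μ2³ and β2 = μ4/μ2² (which should be near 0 and 3 for a Gaussian distribution) and the lag-i product-moment coefficient used below are the ordinary sample forms, assumed here for illustration; they are not necessarily the exact expressions of the paper, and the function name is hypothetical.

```python
def residual_checks(residuals, lag=1):
    """Return (beta1, beta2, lag correlation) for residuals in time order."""
    n = len(residuals)
    mean = sum(residuals) / n
    d = [y - mean for y in residuals]
    m2 = sum(x**2 for x in d) / n
    m3 = sum(x**3 for x in d) / n
    m4 = sum(x**4 for x in d) / n
    beta1 = m3**2 / m2**3            # near 0 for a symmetrical distribution
    beta2 = m4 / m2**2               # near 3 for a Gaussian distribution
    # Correlation between Y_t and Y_(t+lag); near 0 if the residuals are
    # randomly distributed in time.
    r_lag = sum(d[t] * d[t + lag] for t in range(n - lag)) / sum(x**2 for x in d)
    return beta1, beta2, r_lag


if __name__ == "__main__":
    # Artificial residuals that run in blocks, so successive values are correlated.
    blocks = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1, -1, -1]
    print(residual_checks(blocks, lag=1))
```

A series may pass the moment tests and still show a marked lag correlation, which is the case the text goes on to consider.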
To emphasise the importance of the different terms in the relation

\[ {}_p y_t = \phi(\tau_p) + f_p(t) + Y_t \qquad \text{(i) bis}, \]

let us take the case of an astronomer who makes a number of observations, often
at many days’ interval. He will take a mean

\[ \bar{y} = \text{mean } \phi(\tau_p) + \text{mean } f_p(t), \]

but he must not suppose that the quantities

\[ {}_p y_t - \bar{y} = \phi(\tau_p) - \text{mean } \phi(\tau_p) + f_p(t) - \text{mean } f_p(t) + Y_t \]

follow a Gaussian distribution. It will be only a part of the expression that does
so, the Y_t’s, and it is possible that even of these it may not be true.


Further it is clear that successive values of _p y_t − ȳ will not be independent;
correlation will arise from the inclusion of both the secular and sessional terms,
and perhaps too from a relationship between the successive Y_t’s. There may be no
large scale sessional change, and it may be possible to correct for a secular change
in personal equation, but even then the mean of a small number “m” of successive
observations, subject to its probable error .6745 σ/√m, will not be a satisfactory
approximation to the true value of the quantity observed, if these “m” observations
are correlated. Suppose for example that the points in Figure 14 (p. 76) represent a 
series of successive observations which have been corrected for any secular change 
in personal equation; the linear sessional change is small and has been represented 
by the continuous straight line, while the dotted straight line represents the mean 
value of the 63 observations. Yet many sets of 10 consecutive observations could 
be taken, the difference between the mean of which and that of the whole 63 
would be far greater than would be anticipated from the value of the probable error 
calculated from the expression above. This is because the observations are not 
randomly distributed in time. 
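The effect can be illustrated under a simple assumption which is not made in the paper: that the correlation between the i-th and j-th observations falls off as ρ^|i−j|. The variance of the mean of m observations is then (σ²/m²) multiplied by the sum of all these correlations, which reduces to σ²/m only when ρ = 0. The function name is hypothetical.

```python
import math


def probable_error_of_mean(sigma, m, rho=0.0):
    """Probable error of the mean of m observations with correlation rho**|i-j|."""
    total = sum(rho ** abs(i - j) for i in range(m) for j in range(m))
    return 0.6745 * sigma * math.sqrt(total) / m


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6):
        print(rho, round(probable_error_of_mean(1.0, 10, rho), 4))
```

With ρ = 0 the result is the usual .6745 σ/√m; any positive ρ gives a larger value, so the usual expression understates the uncertainty of the mean of consecutive correlated judgments.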

In addition to secular and sessional changes in the value of an estimation, there 
may be similar changes in the standard deviation; the judgments may become 
more erratic or less so. A sessional change giving an increase in standard deviation 
would suggest the effect of fatigue; and secular change decreasing the standard 
deviation might be the indication of increased accuracy with experience. An 
example of secular change in personal equation and standard deviation is illustrated 
in the diagram on p. 84; the details of this will be discussed more fully in the 
reduction of Experiment D, but it is here sufficient to say that the central curve 
represents the smoothed personal equation, while the distance between any point 
on this curve and either of the outer curves gives the smoothed standard deviation 
at that point or period in the series of observations. It will be seen that the 
standard deviation increases in the later observations. 

It would be out of place at this point to enter further into the details of 
variation in personal equation and correlation of judgments, but I think that 
enough has been said to indicate the general lines of enquiry. In choosing the 
experiments which will be described in the following sections, the aim has been 
to select those in which there was likely to be considerable variation in judgment, 
and where consequently the secular and sessional changes, if present, would be 
clearly recognizable and the correlation of successive judgments easy to measure. 
It was also important that the errors in measurement should be small compared 
with the variations in judgment. 


It may of course be urged that the experiments should have been carried out 
by an observer who was unaware of the lines of enquiry and therefore not liable to 
bias of any form, but this was not practicable, and in fact none of the reductions 
had been completed nor the general theory developed before all the experiments 
had been carried out, and I do not think that the observations could have been 
affected by any conscious or unconscious prejudice. 
III. THE EXPERIMENTS.


The present paper is based on the reduction of the following Experiments : 
A. Estimation of the value of a Third, or Trisection Experiment. 
B. Estimation of the value of a Half, or Bisection Experiment. 
C. Estimation of Time, by counting of Ten Seconds. 
D. Estimation of Ten Seconds without intermediate counting. 


E. Some repeated measurements of fine structure in a Stellar Spectrum,
with a Zeiss Comparator.


The first four Experiments were carried out by the writer in accordance with a 
uniform scheme ; each Experiment was divided into 20 series of 63 observations, 
making 1260 observations in all. Only one series (or 63 observations) was done 
at a sitting to avoid as far as possible the effect of fatigue; in the case of 
Experiments A and B the sequence of the series was much broken, spreading 
over some weeks, but C and D were carried out within four consecutive days. 
The dates of the series are given with the detailed discussion of the observations 
below. 


(a) Experiments A and B.


Figure 1 is a copy of one of the printed forms used for these experiments; the
longer line was used for A; distance between inner edges of bounding marks 7.53
inches; the shorter line was used for B; distance between inner edges of bounding
marks 5.94 inches.

The lines were on the same form simply for convenience in printing, etc., and
that not used was concealed while the observation on the other was being made;
a fresh line was used for each of the 1260 observations. In carrying out a series
a pile of 63 forms was placed on a table slightly tilted up towards the observer,
and straight in front of him, with a good light coming from the left-hand side, the
pencil being in his right hand. He then made a short pencil stroke across the line
at the point which he estimated was one-third way along the line from the
left-hand end (Experiment A), or at the point which he considered to bisect the line
(Experiment B). He then turned the form over, face downwards at his side, and
proceeded to deal with the next form in the same manner, continuing until the 63
were finished*. The pencil stroke was made after a rapid eye estimate, the aim
being to record the first impression of third or half formed upon seeing the fresh
line, and to avoid hesitation; the average time taken in going through a series of
63 observations was 5 minutes 40 seconds for Trisection, 5 minutes 22 seconds for
Bisection, or 5.4 seconds and 5.1 seconds respectively between judgments.


To avoid bias, it would have been desirable to complete all the observations of
an experiment before commencing the measurement of any of the series, but
from considerations of time and as all the forms were not printed at the
commencement this was not done. In some cases therefore a series was
measured directly after it had been marked, and if the observer happened to
remember that its estimates were considerably too large or too small, his
judgment would almost certainly be influenced when marking the next succeeding
series; the correlation of judgments within this second series would hardly be
altered, but any natural secular change which had been occurring from series to
series might be broken*.

The measures of the observations were made with a ruler divided to fiftieths of
an inch, so that readings could be taken to one hundredth of an inch with fair
accuracy.

* Actually in Experiments A and B 70 forms were marked in each series; the first 7 were to enable
the observer to “get his eye in,” and the measures of them were not used at all in the reduction.

Fig. 1. Printed form used for Experiments A and B. Line used for Experiment A: 7.53 inches
between inner edges of bounding marks. Line used for Experiment B: 5.94 inches between
inner edges of bounding marks.


(b) Experiments C and D.

These two experiments were carried out with the help of a chronograph. The
instrument was run by clockwork, and had a paper tape on which records could be
made independently by two pens worked by small electromagnets. One pen was put
in circuit with a second’s pendulum, a platinum pointer at the end of which made
contact at each swing through the vertical position by cutting through a bead of
mercury; the other pen was connected with a tapping key. The rate of the driving
clock was not quite uniform, and the pendulum second-marks on the tape were
therefore necessary in reckoning the intervals between the marks made by the
other pen, corresponding to taps of the key. As the estimate in both experiments
was one of 10 seconds, it was found that except for a few cases in Experiment D†,
the true value of the time interval between the taps could be represented with
sufficient accuracy by the factor e/p, where

e was the distance measured on the tape between consecutive marks of the key,

p the length on the tape of the nearest corresponding 10 seconds recorded by the
pendulum pen.

* See p. 49, remark in Table I, regarding Series IX and X.

† In Experiment D, some of the estimates had values nearer 20 seconds than 10 seconds, and
here half the distance on the tape between the nearest corresponding 20 seconds was taken for p.


Had the pendulum been beating exactly one second, 10 × e/p seconds would
have been the true length of the estimate; actually the period as found by
comparison for a long run with a watch was,


before Experiments C and D ( 6th December) 1:020 ae 
after x 7 (16th 0) aOR aaa } 


so that the length of estimate with sufficient accuracy is 10.2 × e/p seconds. It is
the factor e/p that will be used throughout the reductions.
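The conversion from tape measurements to seconds can be written out as a short sketch. The function name and the default period of 1.02 seconds per beat (from the figure 10.2 quoted in the text) are illustrative; e and p may be in any unit of length so long as both use the same one.

```python
def estimate_seconds(e, p, period=1.02):
    """Length of one estimate in seconds.

    e: distance on the tape between the two key marks of the estimate.
    p: length on the tape of the nearest corresponding 10 pendulum beats.
    period: true duration of one pendulum beat in seconds (assumed 1.02).
    """
    return 10.0 * period * e / p


if __name__ == "__main__":
    # Hypothetical tape lengths: an estimate 5 per cent. longer than the 10 beats.
    print(round(estimate_seconds(e=105.0, p=100.0), 2))
```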


Fig. 2. Shows a small piece of tape, and the points from which the measurements were made.


If the amplitude of the pendulum was rather small, it was sometimes
noticeable that the intervals between the second marks were alternately longer and
shorter; this was due either to slight deformation in the shape of the mercury
bead or (what is really the same thing) to the centre of the bead not having
been placed exactly under the equilibrium position of the platinum pointer. But
in taking for measurement the even number of 10 seconds, such errors would be
inappreciable.

In both experiments the beginning and end of the estimate were recorded by
sharp taps on the key (at a and b respectively in Figure 2); a long drawn tap
(c in figure) then followed to make a break before the next estimate was recorded.
The interval between the b tap of one observation and the a tap of the following
varied from 1 to 24 seconds. This method of record soon became quite
automatic, and very few mistaps occurred.


The measurements on the tape were made from the sharp beginnings of the
marks, which correspond to the making of the electric contact at the beginning of
the tap on the key.


In Experiment C the counting was “sotto voce,” the first tap being made on 
the count “nought,” the last on “ten”; in order that the counts might be quite 
uniform the word “sen” was used instead of the two-syllabled “seven.” The 
counting was usually done in step to a slight beat of the thumb on the key (not 
hard enough, of course, to make contact), and it was fairly easy to keep the 
attention concentrated during the counts. In Experiment D there was no counting 
and it was far harder to keep one’s mind fixed; in fact the mental effort required 
was quite noticeable, and I found that a greater interval of rest was required
between each series than for C. It is mainly by reference to the passing of
external events, to changes the duration of which we can infer from previous
experience, that we estimate any but the shortest intervals of time. In the
counting experiment, the second-intervals between each of the 10 counts which
made up the observation were comparatively short, and the beating of the thumb 
or fingers became almost mechanical; the interval of course varied but was not 
subject to violent fluctuations. But while most people are able to estimate a 
second interval with fair accuracy, it would need very much practice to estimate a 
10 second interval, and in my case I found it quite impossible to concentrate 
attention for 10 seconds, solely on the passing of time. I soon found myself 
imagining that I saw the seconds’ hand of a watch, passing usually from the 
position where 60 is marked on the dial to the 10; but it was not another case of 
counting, for I did not note the passing of each individual second mark, only 
having a vague idea of the position of the 5 second division line. If I tried to 
think of nothing, my thoughts probably wandered on to other subjects, until 
I came up with a start, and realising that I had very little idea of how long 
before I had pressed the key to start the observation, pressed it to finish, with 
the greatest uncertainty. To keep attention fixed, it appeared that I must try to 
record the stages of the passage of 10 seconds, and this I was doing vaguely on 
the imaginary clock face, but I must say that the seconds’ hand was very
refractory, at times appearing to stop or even move backwards, and was often so
slow that I had to close the observation before it reached the 10 second mark.


I have given the above description at some length in order to shew that there 
was an essential difference between Experiments C and D, which is borne out by
the figures of the reduction given later in this paper. The observer with the key
sat in a separate room where the beats of the chronograph could not be heard.
Experiment D was actually carried out in the week previous to C; before starting, 
a few trials at estimating 10 seconds had been made with a watch, but these were 
not repeated after the commencement. Again, some 10 second counts were made 
with a watch before starting on C, but no comparison with a watch or clock was 
made during the course of the experiment. The measuring up of C and D was 
left until both experiments were completed, so that the chance of some bias 
to the judgment, which occurred in the case of A and B was avoided. 

(c) Experiment E. 

This consists of nine series of readings made with a Zeiss Comparator at the 
Solar Physics Observatory, Cambridge, on photographic plates of the spectrum of 
Nova Aquilae III. The readings were taken in the first place in order to calculate 
the Probable Errors of the measurements of certain types of structure featuring in 
the broad emission bands, and each series consists of readings taken from 51 
consecutive settings on a particular marking, either a maximum or the edge of a 
maximum. Although the number of readings is not sufficient for any great 
weight to be attached to the results, they are, I think, of sufficient interest to be 
included. In the instrument used, the plate to be measured is fixed to a slide, 
which is moved horizontally in a greased slot by pressure with the hand; the
measurer looks through one eyepiece and pushes the slide until the feature on the 
plate of which he is wishing to measure the position, comes under a cross wire in 
the focus of the eyepiece; then looking through a second eyepiece at the scale 
attached to the slide, he takes the reading, the last two figures of which are read 
from a graduated wheel attached to a micrometer screw-head. In making a measure- 
ment there are therefore two adjustments : 

(1) The setting of the marking in the plate under the cross wire in the first 
eyepiece. 

(2) The shifting of two very close parallel wires by a micrometer screw in the 
second eyepiece, until a line of division on the scale appears to lie exactly in the
centre between them. 

Far the greater source of error arises from the first setting, particularly if the 
marking on the plate is not clear cut. In taking a series of measurements, the 
observer should always move the slide from the same direction, that is, he should
always push it or always pull it, until he thinks that the marking is bisected or
“edged” by the cross wire, and then he should stop; if he obviously overshoots
the mark he should start again, and not hesitatingly move the slide backwards
and forwards in search of what he thinks may be the best setting. By shifting the
slide into position from the same direction, the measures may be all subject to a
fairly constant personal equation due to “over push” or “under push,” “over
pull” or “under pull” of the slide, but this effect may be eliminated by reversing
the plate in the instrument, making a fresh series of measures, and taking the
mean of the two. In this particular set of readings the slide was always “pulled”
into its final position. 


(d) It is hoped that the results of some further experiments of a different type 
in estimating length which were kindly undertaken for me by Mr E. A. Milne of 
Trinity College, and Mr L. J. Comrie of St John’s College, Cambridge, will be 
included in a future paper. 


IV. TERMINOLOGY. 


Experiments A, B, C and D were arranged in accordance with a uniform
scheme, each Experiment being divided into 20 “series” consisting of 63 observations.
The series will be designated by the Roman numerals I, II ... XX in the
order in which they were carried out, and the 63 observations* in a series by the
letters

$$y_1, y_2, \ldots y_t, \ldots y_{63}.$$

In dealing with each Experiment one of the first objects will be to ascertain
whether there is any correlation between successive judgments, and the manner in
which this correlation, if existent, falls off as the interval between the judgments
correlated is increased. To obtain these coefficients of correlation it is necessary


* The first 7 observations, see footnote, p. 28, being always disregarded. 
to divide the observations of each series into “groups,” and thus we have the
50 observations

$y_1, y_2, \ldots y_{50}$ form Group 1 with mean $d_1$ and standard deviation $\sigma_1$,
$y_2, y_3, \ldots y_{51}$ form Group 2 with mean $d_2$ and standard deviation $\sigma_2$,
$y_k, y_{k+1}, \ldots y_{50+k-1}$ form Group k with mean $d_k$ and standard deviation $\sigma_k$,

$y_{14}, y_{15}, \ldots y_{63}$ form Group 14 with mean $d_{14}$ and standard deviation $\sigma_{14}$.


By “the correlation of successive judgments at intervals of one,” I shall understand
the correlation of the 50 observations of Group 1 of a series with the 50
corresponding observations of Group 2 of that series; this will be expressed
as $\rho_1$. Similarly “the correlation of successive judgments at intervals of k,” or $\rho_k$,
is the correlation of the corresponding observations in Groups 1 and k + 1. In fact
$\rho_k$ is given by

$$\rho_k = \frac{\frac{1}{50}\sum_{t=1}^{50} y_t\, y_{t+k} - d_1 d_{k+1}}{\sigma_1\, \sigma_{k+1}}.$$
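
The definition of the correlation at intervals of k can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lag_correlation` and the plain-list input are hypothetical, and the series is assumed to hold the 63 retained observations in order, so that Group 1 is the first n values and Group k + 1 starts k places later.

```python
from math import sqrt

def lag_correlation(y, k, n=50):
    """Correlation between Group 1 (y_1..y_n) and Group k+1 (y_{k+1}..y_{k+n})."""
    g1 = y[0:n]              # Group 1
    gk = y[k:k + n]          # Group k + 1, shifted by k observations
    if len(gk) != n:
        raise ValueError("series too short for this interval")
    d1 = sum(g1) / n         # group means
    dk = sum(gk) / n
    # standard deviations about the group means (divisor n)
    s1 = sqrt(sum(v * v for v in g1) / n - d1 * d1)
    sk = sqrt(sum(v * v for v in gk) / n - dk * dk)
    # mean product minus product of means, divided by the two standard deviations
    prod = sum(a * b for a, b in zip(g1, gk)) / n
    return (prod - d1 * dk) / (s1 * sk)
```

With a 63-value series, k may run from 1 to 13, which matches the 14 Groups described in the text.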


When these constants are to be referred to some particular series, say the
pth, the prefix p will be placed before them, e.g. ${}_p\sigma_1$, ${}_p\sigma_k$, ${}_p\rho_k$, etc.

A comparison of the d's, $\sigma$'s and $\rho$'s of the different series will be instructive,
but as each of these constants has been calculated from 50 observations only, to
obtain quantities with smaller probable errors we must combine the observations
of the 20 series. Thus we shall obtain


1 au 1 
where n = 50, the number in a group,


m = 20, the number of series,


and $\sum_m$ indicates summation for all 20 series.
Putting k = 0 in (vi) we have as the square of the standard deviation
and finally the coefficient of correlation $R_k$ is given by


$D_k$ and $S_k$ are the mean and standard deviation of the combined observations
(1000 in all) of the 20 Groups k, while $R_k$ is the correlation between the 1000
observations in the 20 Groups 1 and the corresponding 1000 observations in
the 20 Groups k + 1, where it must be remembered that owing to the break
between each series the 50th observation in Series I is correlated with the
(50 + k)th observation in that series, and not with the kth in Series II, etc.

It will be seen from the equations (vi) and (viii) that it is possible for $R_k$
to have a large value even though the coefficients of correlation of successive
judgments for the separate series are negligible. For though $\sum_m (\rho_k \sigma_1 \sigma_{k+1})$ may be
zero for k > p, let us say, where p may perhaps be 3 or 4, it is clear that the
coefficients for the combined series, $R_k$, will not vanish as k increases unless

$$\frac{\sum_m (D_1 - d_1)(D_{k+1} - d_{k+1})}{S_1 S_{k+1}} = 0.$$

In fact if this quantity (and therefore $R_k$) does not vanish for values of k for which the
$\rho$'s of the individual series vanish, this is a sign of the existence of a secular
change running through the series; the means of the separate series differ
significantly from the mean of the combined 1000 observations, that is to say they
differ significantly from each other. Now it is important to obtain a measure of
the correlation of successive judgments, when freed from this secular term. First 


I define $S_k'$ by the relation

$$S_k' = \sqrt{\frac{1}{m}\sum_m (\sigma_k^2)} \qquad \text{(ix)},$$

(m = 20, $\sum_m$ indicating summation for the 20 series); it is the standard deviation
of the 1000 observations in the combined Groups k after the secular change has
been removed. Then $R_k'$ is given by

$$R_k' = \frac{\frac{1}{m}\sum_m (\rho_k \sigma_1 \sigma_{k+1})}{S_1' S_{k+1}'} \qquad \text{(x)},$$


this is the correlation of successive judgments freed from secular change; before
correlating the observations we are in fact fitting the series means together, by
subtracting ${}_1d_1 - D_1$ from the observations of the 1st Group of Series I, ${}_1d_2 - D_2$
from the 2nd Group and so on, and again subtracting ${}_1d_{k+1} - D_{k+1}$ from the
observations of the (k + 1)th Group of Series I, etc.
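
The contrast between the crude combined correlation and the correlation freed from secular change can be illustrated numerically. The following is a minimal sketch, assuming each series is a plain list of at least n + k values; the name `combined_correlations` is hypothetical and the code simply applies the definitions (moments about the grand means for the crude value, moments about each series' own group means for the freed value).

```python
from math import sqrt

def combined_correlations(series, k, n=50):
    """Return (R_k, R_k') for m series: crude, and freed from series means."""
    m = len(series)
    g1 = [s[0:n] for s in series]            # Group 1 of every series
    gk = [s[k:k + n] for s in series]        # Group k + 1 of every series
    d1 = [sum(g) / n for g in g1]            # series means of Group 1
    dk = [sum(g) / n for g in gk]            # series means of Group k + 1
    D1, Dk = sum(d1) / m, sum(dk) / m        # means of the combined groups
    N = m * n
    # crude moments about the combined means
    S1sq = sum(v * v for g in g1 for v in g) / N - D1 * D1
    Sksq = sum(v * v for g in gk for v in g) / N - Dk * Dk
    cross = sum(a * b for p in range(m) for a, b in zip(g1[p], gk[p])) / N
    R = (cross - D1 * Dk) / sqrt(S1sq * Sksq)
    # moments after subtracting each series' own group mean (secular term)
    S1p = sum((v - d1[p]) ** 2 for p in range(m) for v in g1[p]) / N
    Skp = sum((v - dk[p]) ** 2 for p in range(m) for v in gk[p]) / N
    crossp = sum((a - d1[p]) * (b - dk[p])
                 for p in range(m) for a, b in zip(g1[p], gk[p])) / N
    return R, crossp / sqrt(S1p * Skp)
```

Two artificial series that alternate by one unit about the levels +10 and -10 give a crude value close to unity although the judgments within each series are perfectly negatively correlated, which is the situation the text warns about.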


Again it may be desirable to examine the residuals after a sessional change 
has been removed from the observations of each series, in addition to the general 
secular term. Suppose that an observation in the pth Series can be expressed in 
the form introduced on page 25 


$${}_py_t = \phi(\tau_p) + f_p(t) + {}_pY_t \qquad \text{(i) bis},$$

where $\phi(\tau_p)$ represents the secular term which we take as constant for all the
observations of the pth Series, and $f_p(t)$ gives the sessional change, then $S_1''$ will
be the standard deviation of the 1000 residuals in the twenty 1st Groups, $S_k''$ of
the 1000 residuals in the twenty kth Groups, etc., so that

$$S_1''^2 = \frac{1}{mn}\sum_m \sum_{t=1}^{n} ({}_pY_t)^2 \qquad \text{(xi)},$$

the mean of the residuals being zero, and m = 20, n = 50 again; while the correlation
of the successive residuals at intervals of k, after the removal of secular and
sessional terms, or $R_k''$, will be given by


$$R_k'' = \frac{\frac{1}{mn}\sum_m \sum_{t=1}^{n} ({}_pY_t \cdot {}_pY_{t+k})}{S_1'' S_{k+1}''} \qquad \text{(xii)}.$$
TABLE OF CONSTANTS.


In the following table definitions are given of the most important of the
constants referred to in the preceding section and of others to be introduced
in the sequel.


1. The kth Group of the pth Series consists of the 50 observations
${}_py_k, {}_py_{k+1}, \ldots {}_py_t, \ldots {}_py_{k+49}$.
As each Series consists of 63 observations, there are 14 Groups in each of the
20 Series.
n will often be used for 50, the number of observations in a Group,
m will often be used for 20, the number of Series.
2. The crude Observations.
(a) For the pth Series.
${}_pd$ = mean of the whole 63 observations.
${}_pd_k$ = mean of observations in kth Group.
${}_p\sigma_k$ = standard deviation of observations in kth Group.
${}_p\rho_k$ = coefficient of correlation between corresponding observations of Groups 1
and k + 1, i.e. between ${}_py_1$ and ${}_py_{k+1}$, ${}_py_2$ and ${}_py_{k+2}$, etc.
${}_p\sigma_\delta$ = standard deviation of the first forward differences of the observations in
Group 1, i.e. of ${}_py_2 - {}_py_1$, ${}_py_3 - {}_py_2$, ... ${}_py_{n+1} - {}_py_n$.
${}_pb$ = slope of the straight line $y - {}_pd_1 = {}_pb\left(t - \frac{n+1}{2}\right)$ which fits “best” the
50 observations ${}_py_1, {}_py_2, \ldots {}_py_t, \ldots {}_py_n$ of Group 1.
${}_p\sigma_k'$ = standard deviation of residuals left after the ordinates of this “best”
fitting straight line have been subtracted from the observations of Group k.
${}_p\rho_k'$ = coefficient of correlation between these residuals of Group 1 and Group
k + 1.
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In the reduction of the results of the experiments, unless it is necessary to 
specify a particular series, the prefix p before these constants will usually be 
omitted for brevity. 

(b) For the combined 20 series.

D = mean of the whole 1260 (= 20 × 63) observations of an experiment.

$D_k$ = mean of the 1000 observations in the combined kth Groups of the
20 series.

$S_k$ = standard deviation of the 1000 observations in the combined kth Group
of the 20 series.

$R_k$ = coefficient of correlation between the 1000 observations in the 1st Groups
and the 1000 corresponding observations in the (k + 1)th Groups.

${}_sR_k$ = coefficient of correlation between the 1000 sth forward differences of the
observations in the 1st Groups and the corresponding differences of the observations
in the (k + 1)th Groups.

$S_\delta$ = standard deviation of the 1000 first forward differences of the observations
in the 1st Groups.

3. The Observations freed from the Secular Change.

The “secular term” in the observation ${}_py_t$, considered as a member of the kth
Group, is ${}_pd_k$. Thus the mean of the 1000 observations in the kth Groups each
freed from its secular term will be zero.

$S_k'$ = standard deviation of the 1000 observations (freed from secular term) in
the kth Groups.

$R_k'$ = coefficient of correlation between the 1000 observations in the 1st Groups
and the 1000 corresponding observations in the (k + 1)th Groups (all freed from
secular term).

4. The Observations freed from both Secular and Sessional Change.

$y = f_p(t)$ is the curve representing the sessional change in the pth Series,
so that $f_p(t)$ is the “sessional term” in ${}_py_t$, the tth observation in the pth Series.

${}_pY_t$ = the residual left after removing the secular and sessional terms from ${}_py_t$.

$S_k''$ = standard deviation of the 1000 Y's in the kth Groups.

$R_k''$ = coefficient of correlation between the 1000 Y's in the 1st Groups and
the corresponding 1000 Y's in the (k + 1)th Groups.

par =the part of ,Y; representing the actual estimate which the observer 
wishes to record. 

pS; = the part of , Y; representing a complex of accidental errors superimposed 
on ,a, 10 the process of record. 

$G_k$ = standard deviation of the sessional terms in the 1000 observations of the
kth Groups.

$F_k$ = 1st order product moment coefficient about the mean of these sessional
terms in the 1st Groups and the corresponding terms in the (k + 1)th Groups.

V. ON METHODS OF REDUCTION.


(a) Variate Difference Correlation. 

It will become evident in the detailed discussion of the results of the experiments,
that a considerable part of the correlation of the successive judgments
is due to a secular change with time, occurring from series to series, and in the
case of the Trisections, to a sessional change as well occurring within the series;
I therefore propose to consider at this point how far the Variate Difference Correlation
Method is applicable in this type of problem, and to do this will approach
the matter from a slightly more general point of view than that of “Student” in
Biometrika, Vol. x. p. 179.

Suppose that x and y are the two variables to be correlated, with corresponding
values

$$x_1, x_2, \ldots x_t, \ldots x_\nu, \ldots$$
$$y_1, y_2, \ldots y_t, \ldots y_\nu, \ldots$$

and that we may express $x_t$ and $y_t$ in the form
$$x_t = F_1(t) + X_t,$$
$$y_t = F_2(t) + Y_t,$$
where $F_1(t)$ and $F_2(t)$ are polynomials of degree n in t, the unit of t being the
interval of time or space between the successive values of the variates, which is
supposed equal and constant; $X_t$ and $Y_t$ are independent of the secular or sessional
change represented by $F_1$ and $F_2$.
Let us now obtain a general expression for
(1) $r_{\Delta_n x\,\Delta_n y}$ or ${}_nR$, the correlation of the nth forward differences of $x_t$ and $y_t$,
(2) $r_{\Delta_n X\,\Delta_n Y}$ or ${}_nR'$, the correlation of the nth forward differences of $X_t$ and $Y_t$.

Now

$$\Delta_n x_t = (1 - \epsilon)^n x_{n+t} = x_{n+t} - n\,x_{n+t-1} + \ldots + (-1)^s \frac{n!}{s!\,(n-s)!}\, x_{n+t-s} + \ldots + (-1)^n x_t \qquad \text{(xiii)},$$

where the operator $\epsilon$ is defined by $\epsilon^s x_t = x_{t-s}$, etc.
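Further we must assume that

(a) $\sum_{t=1}^{\nu} x_{t+h}$ = constant for all values of h small compared with $\nu$,
= 0, by suitable choice of origin,
$\sum_{t=1}^{\nu} y_{t+h} = 0$,
from which it follows that $\sum_{t=1}^{\nu} \Delta_n x_{t+h} = 0 = \sum_{t=1}^{\nu} \Delta_n y_{t+h}$,

(b) $\sum_{t=1}^{\nu} (x_{t+h})^2$ = constant = $\nu\sigma_x^2$ for all values of h small compared with $\nu$,
$\sum_{t=1}^{\nu} (y_{t+h})^2 = \nu\sigma_y^2$ for all values of h small compared with $\nu$,

(c) $\sum_{t=1}^{\nu} (x_{t+h}\, x_{t+h+k}) = \nu\,{}_x\rho_k\,\sigma_x^2$ for all values of h small compared with $\nu$,
$\sum_{t=1}^{\nu} (y_{t+h}\, y_{t+h+k}) = \nu\,{}_y\rho_k\,\sigma_y^2$ for all values of h small compared with $\nu$,
$\sum_{t=1}^{\nu} (x_{t+h+k}\, y_{t+h}) = \nu\,{}_{xy}\rho_k\,\sigma_x\sigma_y$ for all values of h small compared with $\nu$.

Similar relations will hold for the residuals X and Y.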
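Then a little consideration shews that the sum of the coefficients of the
products of the x's and y's whose indices differ by p in the expression
$\Delta_n x_t\,\Delta_n y_t$ or $(1 - \epsilon)^n x_{n+t}\,(1 - \epsilon)^n y_{n+t}$
is the coefficient of $\nu\,{}_{xy}\rho_p\,\sigma_x\sigma_y$ in the product moment
$\sum \Delta_n x_t \cdot \Delta_n y_t$; call this coefficient $a_p$.

Now $\epsilon^r$ operating on $x_{n+t}$ gives $x_{n+t-r}$, and $\epsilon^{r'}$ operating on $y_{n+t}$ gives $y_{n+t-r'}$,
and if $(n+t-r) - (n+t-r') = p$, then $r' - r = p$; hence $a_p$ is the sum of the
coefficients of the products $\epsilon_1^{r}\epsilon_2^{r'}$ in the expansion of $(1-\epsilon_1)^n(1-\epsilon_2)^n$ for which
$r' - r = p$, or the coefficient of $\epsilon^p$ in

$$\left(1 - \frac{1}{\epsilon}\right)^n (1 - \epsilon)^n,$$

or of $\epsilon^{n+p}$ in $(-1)^n (1 - \epsilon)^{2n}$, so that

$$a_p = (-1)^p \frac{(2n)!}{(n-p)!\,(n+p)!}.$$

Hence finally writing j = n + p we have

$$\frac{1}{\nu}\sum_{t=1}^{\nu} \Delta_n x_t\,\Delta_n y_t = \sigma_x\sigma_y \sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\;{}_{xy}\rho_{j-n} \qquad \text{(xiv)},$$

where negative values of the subscript of $\rho$ imply that the subscript of x is less
than that of y; e.g. ${}_{xy}\rho_{-p}$ is the correlation between $x_t$ and $y_{t+p}$.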
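
The binomial weights entering the product moment of nth differences can be checked by direct expansion. The sketch below is illustrative only (both function names are hypothetical): the first function writes down the closed form with j = n + p, the second multiplies out the coefficients of two nth-difference operators and collects the terms whose indices differ by p.

```python
from math import comb

def difference_weights(n):
    """Closed form: (-1)**(j - n) * (2n)! / ((2n - j)! j!) for j = 0..2n."""
    # (-1)**(j + n) has the same sign as (-1)**(j - n) and stays an integer
    return [(-1) ** (j + n) * comb(2 * n, j) for j in range(2 * n + 1)]

def weights_by_expansion(n):
    """Same weights obtained by multiplying out (1 - e)^n against itself."""
    c = [(-1) ** s * comb(n, s) for s in range(n + 1)]   # nth-difference coefficients
    out = []
    for p in range(-n, n + 1):                           # index gap p = j - n
        out.append(sum(c[r] * c[r + p]
                       for r in range(n + 1) if 0 <= r + p <= n))
    return out
```

For first differences the weights are -1, 2, -1 and for second differences 1, -4, 6, -4, 1, the patterns that appear in the first and second difference equations later in this section.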


Similarly for the standard deviations of the nth differences

$$\frac{1}{\nu}\sum_{t=1}^{\nu} (\Delta_n x_t)^2 = \sigma_x^2 \sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\;{}_x\rho_{j-n} \qquad \text{(xv)},$$

$$\frac{1}{\nu}\sum_{t=1}^{\nu} (\Delta_n y_t)^2 = \sigma_y^2 \sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\;{}_y\rho_{j-n} \qquad \text{(xvi)},$$

and for the correlation between the differences

$${}_nR = \frac{\sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\;{}_{xy}\rho_{j-n}}{\sqrt{\sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\;{}_x\rho_{j-n}}\;\sqrt{\sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\;{}_y\rho_{j-n}}} \qquad \text{(xvii)}.$$

The correlation of the nth forward differences of the residuals $X_t$ and $Y_t$, or ${}_nR'$,
will equal an exactly similar expression to the last, in which ${}_{XY}\rho$, ${}_X\rho$ and ${}_Y\rho$ are
substituted for ${}_{xy}\rho$, ${}_x\rho$ and ${}_y\rho$. But as $F_1(t)$ and $F_2(t)$ are polynomials of degree
n in t, we know that
$$\Delta_n x_t = \Delta_n X_t + \text{constant},$$
$$\Delta_n y_t = \Delta_n Y_t + \text{constant},$$
and therefore
$${}_nR = r_{\Delta_n x\,\Delta_n y} = r_{\Delta_n X\,\Delta_n Y} = {}_nR',$$
that is to say we may equate ${}_nR$ to an expression similar to that on the right hand
side of (xvii) above, except that the correlation coefficients of the residuals, namely
${}_{XY}\rho$, ${}_X\rho$ and ${}_Y\rho$, are to be substituted for ${}_{xy}\rho$, ${}_x\rho$ and ${}_y\rho$.

Now in the usual problem to which the Variate Difference Method is applied
it is assumed that after taking a sufficient number of differences we shall approach
a state in which the corresponding values of $X_t$ and $Y_t$, the residuals left after
the ordinates of an nth order parabola have been subtracted from $x_t$ and $y_t$, are
mutually at random in time or space; or that

$${}_{XY}\rho_p = 0, \quad {}_X\rho_p = 0, \quad {}_Y\rho_p = 0,$$
for all values of p other than zero, and that
$${}_X\rho_0 = 1 = {}_Y\rho_0, \quad {}_{XY}\rho_0 = r_{XY},$$
i.e. the correlation between $X_t$ and $Y_t$. Upon this assumption it follows at once
from the modified form of (xvii) that

$${}_nR = {}_{XY}\rho_0 \quad \text{or} \quad r_{\Delta_n x\,\Delta_n y} = r_{XY},$$
the fundamental relation of the original Variate Difference Correlation Method.
Let us now turn to the particular type of problem in which we wish to correlate
the successive values of the same variate. If we are correlating the values at
intervals of k, we shall have as corresponding variables, not $x_t$ and $y_t$ but
$y_t$ and $y_{t+k}$, so that

${}_{xy}\rho_{j-n}$ becomes $\rho_{j+k-n}$ and ${}_{XY}\rho_{j-n}$ may be written $\rho^{(n)}_{j+k-n}$,
${}_x\rho_{j-n}$ becomes $\rho_{j-n}$ and ${}_X\rho_{j-n}$ may be written $\rho^{(n)}_{j-n}$,
${}_y\rho_{j-n}$ becomes $\rho_{j-n}$ and ${}_Y\rho_{j-n}$ may be written $\rho^{(n)}_{j-n}$,

where as in the notation of page 35 $\rho_p$ is the correlation of successive values of
the variate at intervals of p, and $\rho^{(n)}_p$ the correlation of successive residuals
(at intervals of p) which are left after the subtraction of the ordinates of an
nth order parabola representing the secular change. Hence we have from equation
(xvii) that ${}_nR_k$, or the correlation between the nth forward differences of $y_t$ and $y_{t+k}$,
is given by

$${}_nR_k = \frac{\sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\,\rho_{k+j-n}}{\sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\,\rho_{j-n}} \qquad \text{(xviii)}$$

$$= \frac{\sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\,\rho^{(n)}_{k+j-n}}{\sum_{j=0}^{2n} (-1)^{j-n}\frac{(2n)!}{(2n-j)!\,j!}\,\rho^{(n)}_{j-n}} \qquad \text{(xix)},$$

where negative values of the subscript of $\rho$ and $\rho^{(n)}$ are to be treated as positive,
e.g. $\rho_{-1} = \rho_1$.
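
As a hedged illustration of how equation (xviii) is used in practice, the sketch below computes the correlation of nth differences at interval k from a list of crude correlations. The function name and the list convention (`rho[p]` is the correlation at interval p, with `rho[0] = 1`) are assumptions made for the example.

```python
from math import comb

def difference_correlation(rho, n, k):
    """Correlation of nth forward differences at interval k, from crude rho's."""
    def r(p):
        return rho[abs(p)]            # negative subscripts are treated as positive
    # binomial weights (-1)**(j - n) * C(2n, j), written with j + n to stay integral
    w = [(-1) ** (j + n) * comb(2 * n, j) for j in range(2 * n + 1)]
    num = sum(w[j] * r(k + j - n) for j in range(2 * n + 1))
    den = sum(w[j] * r(j - n) for j in range(2 * n + 1))
    return num / den
```

For a purely random series (all correlations zero except at interval 0) this gives -1/2 for adjacent first differences and -2/3 for adjacent second differences, so differencing by itself induces correlation between successive values. The list must be long enough to cover interval k + n.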

We are again supposing that this secular change can be represented by $y = f(t)$,
a polynomial of degree n in t, but we cannot expect that after removing a parabola
of even 5th or 6th order*, the residuals $Y_1, Y_2, \ldots Y_t, \ldots Y_\nu$ will be mutually at
random in time or space; if we anticipate correlation between $Y_t$ and $Y_{t+k}$,
we must also be prepared for correlation between $Y_t$ and $Y_{t+k-1}$, and in any case
the correlation between $Y_t$ and $Y_t$, or $\rho^{(n)}_{k+j-n}$ where j = n − k, will be unity.
Hence we cannot make the assumptions of the first problem (that ${}_{XY}\rho_p = 0$, etc.),
in fact

$r_{\Delta_n y_t\,\Delta_n y_{t+k}}$ is not equal to $r_{Y_t Y_{t+k}}$.


Now consider the use which may be made of equations (xviii) and (xix). If
the values of the $\rho_p$'s have been calculated from the crude values of the variate,
the quickest method of finding the correlations of differences ${}_nR_k$ is not by direct
calculation but by putting these known values of the $\rho_p$'s into the right hand side
of (xviii). Then using (xix) we have a number of equations connecting the
$\rho^{(n)}_p$'s, and the question that at once arises is whether there are sufficient
equations to determine these coefficients. It will be seen at once that there
cannot be; if we are proceeding to nth differences, we can obtain q equations by
putting k = 1, 2, ... q, but these will contain coefficients $\rho^{(n)}_1$ to $\rho^{(n)}_{n+q}$; in fact
n more equations are required. By using the appropriate equations for the
Product Moments and for the Standard Deviation of nth differences corresponding
to (xiv), (xv) and (xvi) we could obtain one further equation, but at the same
time we introduce one further unknown, the standard deviation of the residuals.


That these equations will be indeterminate can be seen from another standpoint;
the nth difference correlation equations (xviii) and (xix) will be satisfied
not only by the $\rho_p$'s and $\rho^{(n)}_p$'s as defined above, but by the correlation of the
residuals left after the ordinates of a parabola of any order less than n have been
subtracted from the crude observations. Nor can further equations for the
$\rho^{(n)}_p$'s be obtained by proceeding to n + 1, or higher differences; the further
relations obtained will not be independent, for example

$${}_{n+1}R_k = \frac{-\,{}_nR_{k-1} + 2\,{}_nR_k - {}_nR_{k+1}}{2\,(1 - {}_nR_1)}, \quad \text{etc.}$$

The possible application of these difference correlation equations is considered
in the next section.


(b) The Application of the Results of the preceding Section. 


Although the correlation of differences does not appear to provide a general 
method for obtaining the correlation of successive values of a variate after secular 
changes have been removed, the equations (xviii) and (xix) will be found of con- 
siderable assistance in certain cases. 


* The figures will probably not warrant the taking of differences of much higher orders than 
5th or 6th. 




The results of the analysis given in the three illustrative problems below will 
be used in obtaining the values of various constants in the reduction of the 
experiments in the later sections. It seemed desirable to collect the algebra 
together in this way, but in reading this paper the reader may find it more 
convenient to pass on and refer back to the theory when occasion arises for the 
numerical application of the results. 


Problem 1. In this and the following illustrations of the method of the 
preceding section, the notation of Section IV for the correlation of judgment will 
be used. 

I shall suppose that we have m series of observations through the course of
which there is some form of secular change; the means of the different series, or
the values of ${}_pd$, varying considerably. The coefficients of correlation for the
combined series, $R_1, R_2, \ldots R_k \ldots R_s$ have been calculated, and also the single
coefficient $R_1'$, the correlation of the successive values of the observations (at
intervals of 1) after the series means have been fitted together, i.e. after removal
of secular change.

It is clear that $\Delta_1 y_t = \Delta_1 Y_t'$, where $y_t = d_1 + Y_t'$, within any one series, and

$$\sum_m \sum_{t=1}^{n} (\Delta_1 y_t \cdot \Delta_1 y_{t+k}) = \sum_m \sum_{t=1}^{n} (\Delta_1 Y_t' \cdot \Delta_1 Y_{t+k}'), \quad \text{etc.},$$

where again $\sum_m$ stands for summation for the m series, so that the 1st difference
correlation equations (xviii) and (xix) are applicable, and become

$${}_1R_1 = \frac{-1 + 2R_1 - R_2}{2(1 - R_1)}, \qquad {}_1R_k = \frac{-R_{k-1} + 2R_k - R_{k+1}}{2(1 - R_1)}, \quad k = 2 \text{ to } s - 1 \qquad \text{(xx)},$$

$${}_1R_1 = \frac{-1 + 2R_1' - R_2'}{2(1 - R_1')}, \qquad {}_1R_k = \frac{-R_{k-1}' + 2R_k' - R_{k+1}'}{2(1 - R_1')}, \quad k = 2 \text{ to } s - 1 \qquad \text{(xxi)}.$$

From (xx) we get the values of ${}_1R_k$, k = 1, 2 ... s − 1, and using these and the value
of $R_1'$ already supposed to be known, the s − 1 equations (xxi) will give the s − 1
unknowns $R_2', \ldots R_s'$.
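
The procedure of Problem 1 amounts to a forward recursion, which may be sketched as follows. The function names are hypothetical, and the lists are assumed to be indexed by interval, with the value 1 at index 0.

```python
def first_difference_correlations(R):
    """Values of the first-difference correlation for k = 1..s-1, from R[0..s]."""
    s = len(R) - 1
    return [(-R[k - 1] + 2 * R[k] - R[k + 1]) / (2 * (1 - R[1]))
            for k in range(1, s)]

def recover_primed(one_R, R1_prime):
    """Solve the equations of type (xxi) in turn for R'_2, R'_3, ...

    one_R[k - 1] holds the first-difference correlation at interval k;
    R1_prime is the single coefficient R'_1 calculated directly.
    """
    Rp = [1.0, R1_prime]
    for k in range(1, len(one_R) + 1):
        # rearranged: R'_{k+1} = 2 R'_k - R'_{k-1} - 2 (1 - R'_1) * (1R_k)
        Rp.append(2 * Rp[k] - Rp[k - 1] - 2 * (1 - R1_prime) * one_R[k - 1])
    return Rp
```

Because each new value is built on the two before it, any error in the directly calculated coefficient is carried forward through the whole sequence, which is the cumulative effect discussed under Problem 3.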

The accuracy of this method will of course depend on the errors involved in 
the assumptions (a), (b), and (c) of page 37 above. 

Problem 2. To obtain the coefficients of correlation of the successive residuals
left after the ordinates of the “best” fitting straight lines have been subtracted
from each of m series of observations, that is, after the removal of a linear sessional
change as well as a secular change. In the notation of p. 35 these coefficients
may therefore be written

$$R_1'', R_2'', \ldots R_k'', \ldots$$

In the first place let us obtain the constants of the straight line “best” fitting
the 50 observations of Group 1 of a series; this can be done by the method of
Least Squares.

If for any series the equation to the line is

$$y = d + b\left(t - \frac{n+1}{2}\right) \quad (n = 50 \text{ as before}) \qquad \text{(xxii)},$$
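where the tth observation is

$$y_t = d + b\left(t - \frac{n+1}{2}\right) + Y_t,$$

we have that

$$K = \sum_{t=1}^{n} Y_t^2 = \sum_{t=1}^{n}\left\{y_t - d - b\left(t - \frac{n+1}{2}\right)\right\}^2 \text{ is to be a minimum},$$

therefore $\frac{\partial K}{\partial d} = 0$ and $\frac{\partial K}{\partial b} = 0$,

or $\sum_{t=1}^{n} Y_t = 0$, whence $\sum_{t=1}^{n} y_t = nd$,

and $\sum_{t=1}^{n}\left\{y_t - d - b\left(t - \frac{n+1}{2}\right)\right\}\left(t - \frac{n+1}{2}\right) = 0$,

giving $\sum_{t=1}^{n} y_t\left(t - \frac{n+1}{2}\right) = b\sum_{t=1}^{n}\left(t^2 - (n+1)t + \tfrac{1}{4}(n+1)^2\right)$.


Or, the first order product moment coefficient about the mean of $y_t$ and t is

$$p_{yt} = b\,\frac{n^2 - 1}{12},$$

giving for the constants of the best fitting line

$$d = d_1 = \frac{1}{n}\sum_{t=1}^{n} y_t, \qquad b = \frac{12}{n^2 - 1}\,p_{yt}.$$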
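
The constants of the best fitting line follow directly from the mean and the product moment with t. A minimal sketch, with a hypothetical function name and the observations supplied as a list taken to be numbered t = 1, 2, ... n:

```python
def best_fitting_line(y):
    """Return (d, b) for the least squares line y = d + b (t - (n + 1)/2)."""
    n = len(y)
    d = sum(y) / n                       # the line passes through the group mean
    c = (n + 1) / 2                      # mean of t = 1..n
    # first order product moment of y and t about their means
    p_yt = sum((t - c) * (v - d) for t, v in enumerate(y, start=1)) / n
    b = 12 * p_yt / (n * n - 1)          # variance of t is (n^2 - 1)/12
    return d, b
```

No matrix inversion is needed because t takes equally spaced values, so its variance is known in advance.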
The next step is to obtain the correlation of the successive residuals left after
the ordinates of this line have been subtracted from the observations.


We shall have that

$$n\sigma_1\sigma_2\rho_1 = \sum_{t=1}^{n}\left\{d + b\left(t - \tfrac{n+1}{2}\right) + Y_t\right\}\left\{d + b\left(t + 1 - \tfrac{n+1}{2}\right) + Y_{t+1}\right\} - nd\left(d + \frac{y_{n+1} - y_1}{n}\right)$$

$$= \sum_{t=1}^{n} Y_t Y_{t+1} + \frac{b^2}{12}\,n(n^2 - 1) + \frac{b}{2}\left\{(n+1)y_1 + (n-1)y_{n+1} - 2nd\right\},$$

and if $\rho_1'$ be the correlation of the successive residuals and $\sigma_1'$ and $\sigma_2'$ the
corresponding standard deviations in Groups 1 and 2, we have finally

$$\rho_1' = \frac{\sigma_1\sigma_2\rho_1 - \frac{b^2}{12}(n^2 - 1) - \frac{b}{2n}\left\{(n+1)y_1 + (n-1)y_{n+1} - 2nd\right\}}{\sigma_1'\sigma_2'} \qquad \text{(xxiii)}.$$

Similarly we have

$$n\sigma_1^2 = \sum_{t=1}^{n}\left\{d + b\left(t - \tfrac{n+1}{2}\right) + Y_t\right\}^2 - nd^2 = \sum_{t=1}^{n} Y_t^2 + 2bn\,p_{yt} - \frac{b^2}{12}\,n(n^2 - 1) = n\sigma_1'^2 + \frac{b^2}{12}\,n(n^2 - 1),$$

whence it follows that

$$\sigma_1'^2 = \sigma_1^2 - \frac{b^2}{12}(n^2 - 1) \qquad \text{(xxiv)}.$$

And again,

$$n\sigma_2^2 = \sum_{t=1}^{n}\left\{d + b\left(t + 1 - \tfrac{n+1}{2}\right) + Y_{t+1}\right\}^2 - n\left(d + \frac{y_{n+1} - y_1}{n}\right)^2$$

$$= n\sigma_2'^2 + \frac{b^2}{12}\,n(n^2 - 1) + b\left\{(n+1)y_1 + (n-1)y_{n+1} - 2nd\right\},$$

so that

$$\sigma_2'^2 = \sigma_2^2 - \frac{b^2}{12}(n^2 - 1) - \frac{b}{n}\left\{(n+1)y_1 + (n-1)y_{n+1} - 2nd\right\} \qquad \text{(xxv)}.$$
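
Relations (xxiii), (xxiv) and (xxv) allow the constants of the residuals to be obtained from the crude constants without forming the residuals themselves. The sketch below is one way this might be coded; it assumes the standard deviations are taken about the group means with divisor n, and the helper names are hypothetical.

```python
from math import sqrt

def moments(u, v):
    """Standard deviations of u and v and their correlation (divisor n)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sqrt(sum((a - mu) ** 2 for a in u) / n)
    sv = sqrt(sum((a - mv) ** 2 for a in v) / n)
    r = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n * su * sv)
    return su, sv, r

def residual_constants(y, n):
    """y holds y_1..y_{n+1}; returns (sigma_1', sigma_2', rho_1')."""
    g1, g2 = y[:n], y[1:n + 1]                   # Groups 1 and 2
    s1, s2, r1 = moments(g1, g2)                 # crude constants
    d = sum(g1) / n
    c = (n + 1) / 2
    # slope of the line best fitting Group 1
    b = 12 * sum((t - c) * v for t, v in enumerate(g1, 1)) / (n * (n * n - 1))
    edge = (n + 1) * y[0] + (n - 1) * y[n] - 2 * n * d
    v1 = s1 ** 2 - b * b * (n * n - 1) / 12                  # (xxiv)
    v2 = s2 ** 2 - b * b * (n * n - 1) / 12 - b * edge / n   # (xxv)
    cov = s1 * s2 * r1 - b * b * (n * n - 1) / 12 - b * edge / (2 * n)
    return sqrt(v1), sqrt(v2), cov / sqrt(v1 * v2)           # (xxiii)
```

A direct check is to subtract the fitted line (extended one step to t = n + 1) from the observations and compute the same three constants from the residuals with `moments`; the two routes should agree to rounding error.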


If the values of $\rho_1'$ have been calculated by this means for each of the m series,
we shall have for the combined series,

$$R_1'' = \frac{\sum_m (\rho_1'\sigma_1'\sigma_2')}{\sqrt{\sum_m (\sigma_1'^2)\,\sum_m (\sigma_2'^2)}} \qquad \text{(xxvi)},$$

a modified form of equation (xii).


As we are subtracting the ordinates of a different straight line from each
series, a modification of the first-difference equations may be necessary. The
1st order product moment coefficient, for the m combined series*, of successive
first differences at intervals of k is given by

$${}_1P_k = \frac{1}{mn}\sum_m\sum_{t=1}^{n} (y_t - y_{t+1})(y_{t+k} - y_{t+k+1}) - \left\{\frac{1}{m}\sum_m \frac{y_1 - y_{n+1}}{n}\right\}\left\{\frac{1}{m}\sum_m \frac{y_{k+1} - y_{n+k+1}}{n}\right\}$$

$$= \frac{1}{mn}\sum_m\sum_{t=1}^{n} (Y_t - Y_{t+1} - b)(Y_{t+k} - Y_{t+k+1} - b) - \left\{\frac{1}{m}\sum_m \left(\frac{Y_1 - Y_{n+1}}{n} - b\right)\right\}\left\{\frac{1}{m}\sum_m \left(\frac{Y_{k+1} - Y_{n+k+1}}{n} - b\right)\right\}.$$

Or finally,

$${}_1P_k = (-R_{k-1}'' + 2R_k'' - R_{k+1}'')\,S''^2 + Q_k + \Sigma_b^2 \qquad \text{(xxvii)},$$

making the assumptions (a), (b), and (c) of p. 37, and where

$$Q_k = -\frac{1}{m}\sum_m \left(b\,\frac{Y_1 - Y_{n+1} + Y_{k+1} - Y_{n+k+1}}{n}\right) + \left\{\frac{1}{m}\sum_m b\right\}\left\{\frac{1}{m}\sum_m \frac{Y_1 - Y_{n+1} + Y_{k+1} - Y_{n+k+1}}{n}\right\},$$

and $\Sigma_b$ is the standard deviation of the b's.


There will be similar corrected expressions for the standard deviations of the
combined first differences.


If we are justified in neglecting terms of the order of $Q_k + \Sigma_b^2$, we may use the
first difference equations,

$${}_1R_1 = \frac{-1 + 2R_1'' - R_2''}{2(1 - R_1'')}, \qquad {}_1R_k = \frac{-R_{k-1}'' + 2R_k'' - R_{k+1}''}{2(1 - R_1'')}, \quad k = 2 \text{ to } s - 1 \qquad \text{(xxviii)},$$

where, as in Problem 1, the known $R_k$'s will give the ${}_1R_k$'s, and it will only be
necessary to calculate directly the one quantity $R_1''$, in order to obtain

$$R_2'', R_3'', \ldots R_s''.$$


Problem 3. In the last illustration it may happen that while $Q_k + \Sigma_b^2$ is so
small as to cause only a negligible error in the value of $R_2''$ found from

$${}_1R_1 = \frac{-1 + 2R_1'' - R_2''}{2(1 - R_1'')},$$

the cumulative effect of this error may be considerable in the value found for
$R_s''$ (s = 12, say). If then we take second differences

$${}_2P_k = \frac{1}{mn}\sum_m\sum_{t=1}^{n} (y_t - 2y_{t+1} + y_{t+2})(y_{t+k} - 2y_{t+k+1} + y_{t+k+2})$$
$$\qquad - \left\{\frac{1}{m}\sum_m \frac{y_1 - y_2 - y_{n+1} + y_{n+2}}{n}\right\}\left\{\frac{1}{m}\sum_m \frac{y_{k+1} - y_{k+2} - y_{k+n+1} + y_{k+n+2}}{n}\right\}$$

$$= \frac{1}{mn}\sum_m\sum_{t=1}^{n} (Y_t - 2Y_{t+1} + Y_{t+2})(Y_{t+k} - 2Y_{t+k+1} + Y_{t+k+2})$$
$$\qquad - \left\{\frac{1}{m}\sum_m \frac{Y_1 - Y_2 - Y_{n+1} + Y_{n+2}}{n}\right\}\left\{\frac{1}{m}\sum_m \frac{Y_{k+1} - Y_{k+2} - Y_{k+n+1} + Y_{k+n+2}}{n}\right\}$$

$$= (R_{k-2}'' - 4R_{k-1}'' + 6R_k'' - 4R_{k+1}'' + R_{k+2}'')\,S''^2,$$

and is independent of the differing values of the b's.

* ${}_pb$ is the slope of best fitting line in the pth Series.
The appropriate equations are in fact of type,

$${}_2R_k = \frac{R_{k-2}'' - 4R_{k-1}'' + 6R_k'' - 4R_{k+1}'' + R_{k+2}''}{2\,(3 - 4R_1'' + R_2'')} \qquad \text{(xxix)},$$

for k = 1, 2, 3 ... s − 2, where $R_{-1}'' = R_1''$ etc. and $R_0'' = 1$. Then using the known
value of $R_1''$, and that of $R_2''$, found as in Problem 2 from the first difference
equation, these s − 2 equations will give the s − 2 unknowns $R_3'' \ldots R_s''$.
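
The second-difference equations can be solved in the same step-by-step manner as the first-difference ones. A minimal sketch, assuming the second-difference correlations are supplied as a list starting at interval 1 and that the first two coefficients are already known (the function name is hypothetical):

```python
def recover_from_second_differences(two_R, R1, R2):
    """Solve equations of type (xxix) in turn for R''_3, R''_4, ...

    two_R[k - 1] is the second-difference correlation at interval k.
    """
    R = [1.0, R1, R2]
    scale = 2 * (3 - 4 * R1 + R2)            # denominator of (xxix)
    for k in range(1, len(two_R) + 1):
        # negative subscripts are read as positive, hence abs(k - 2)
        nxt = (scale * two_R[k - 1] - R[abs(k - 2)]
               + 4 * R[k - 1] - 6 * R[k] + 4 * R[k + 1])
        R.append(nxt)
    return R
```

Each step uses the four preceding coefficients, so the starting pair must be as accurate as possible.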


It is clear that similar methods could be applied in the case of sessional changes
of higher order, but I have taken the algebra in these three Problems, as the
results will be used in the reduction of the experiments later on. The general
explanation and equations may have appeared long, but the actual calculation in
any particular case of such quantities as ${}_1R_1, {}_1R_2, \ldots {}_1R_s$, or ${}_2R_1, \ldots {}_2R_s$, and then of
$R_1', \ldots R_s'$, and $R_1'', \ldots R_s''$, is exceedingly simple, and far shorter than a direct
calculation from the crude figures would be. In two cases the correlations were
calculated both by the difference correlation method and directly without approximation,
and the agreement of the former results with the latter established confidence
in this method of approximation.


VI. EXPERIMENT A (TRISECTION). REDUCTION OF OBSERVATIONS.

(a) The individual Series.

The observations of this Experiment have been reduced in more detail than in
the other cases; the values of $\rho_k$, k = 1, 2, ... 13, were found separately for each
series, and these and the values of d and $\sigma$ (the means and standard deviations
of the Groups) are given in Tables I, II and III. Several points of interest will be
noted; in the first place the observations have a marked tendency to decrease (i.e.
for the estimate of a third to become smaller) both in the course of a series (as is
seen by the general decrease of $d_k$ as k increases) and also in passing from the
earlier to the later series. These are examples of what have been termed Sessional
and Secular Changes. These changes are illustrated in Figure 3 where the centres
of the circles give the values of $d_1$ for each Series, the length of the dotted lines
from either side of these points representing the standard deviations $\sigma_1$, and the
continuous lines through the points representing the “best” fitting straight lines
for the 50 observations of Group 1; the slopes of these last lines, or constants ${}_pb$,
have been calculated by the Least Square method as in Problem 2, p. 41, and their
values are given in the 3rd column of Table IV.


Another way of examining the sessional change, and of obtaining a typical
representation of it, is to calculate the average values for the 20 series of $y_t$, the
tth observation in a series; thus

$$\bar{y}_t = \frac{1}{m}\sum_m {}_py_t = \frac{1}{m}\sum_m ({}_pd + {}_pY_t),$$

where ${}_pd$ stands for the mean of the pth series (63 observations) as opposed to ${}_pd_k$,
the mean of a particular Group k of that series.

The values of $Y_t$ represent the sessional variation in any series about the
mean of that series or session of observations, and the sequence $\bar{y}_t - D$, t = 1,
2, 3, ... 63, will clearly represent the mean sessional change. The values of $\bar{y}_t$ are
given at the end of Table II and have been plotted in Figure 4, where they have
been fitted with the second order parabola (calculated by least squares)


y = "486 + 00255¢ — 00001892 ........ cece wees (Xxx). 
Fig. 4. Trisection Experiment. Mean Sessional Change. Estimate of 1/3 of length (mean of 20 series) plotted against order of observation in individual series; the mean of observed thirds and the actual third are marked.
Figures 3 and 4 together show very clearly the marked sessional change; while
the former shows that except in a few series, notably Series I, IV and X, the
regression is remarkably constant in its value, the latter indicates that the sessional
change is better represented by a parabola than by a straight line.
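The least-squares fit of a second order parabola referred to above can be sketched as follows. This is only an illustration in Python, not the computation behind equation (xxx): the function name `fit_parabola` and the synthetic series are assumptions, since the means of Table II are not reproduced in this section.

```python
def fit_parabola(ts, ys):
    """Least-squares fit of y = a + b*t + c*t**2 via the normal equations."""
    # Power sums of t up to the fourth order, and moments of y against t.
    s = [sum(t ** k for t in ts) for k in range(5)]
    m = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    rows = [[s[i + j] for j in range(3)] + [m[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
    return tuple(rows[i][3] / rows[i][i] for i in range(3))


# Hypothetical series of 63 values (synthetic, not the means of Table II).
ts = list(range(1, 64))
ys = [2.55 - 0.002 * t - 0.00002 * t * t for t in ts]
a, b, c = fit_parabola(ts, ys)
print(f"y = {a:.4f} + {b:.6f} t + {c:.8f} t^2")
```

A negative coefficient of $t^2$ corresponds to the increasing drop towards the end of a series which distinguishes the parabola from the straight line.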

The sessional change can also be represented numerically with the help of the
correlation ratio of $y$ upon $t$. If we are dealing with the observations freed from
the secular change, that is after the removal of the means ${}_p d$ from the 63 observations
of the $p$th series, we have $\eta_{yt}$ given by

$$\eta_{yt}^2 = \frac{\frac{1}{63}\sum_{t=1}^{63}(\bar{y}_t - D)^2}{S'^2}, \quad\text{where}\quad S'^2 = \frac{1}{1260}\sum_{m}\sum_{t=1}^{63}\left({}_p y_t - {}_p d\right)^2,$$


or $S'$ is the standard deviation of the whole 1260 observations after the removal
of the secular term*. Then the ratio of the root mean square distance of every
observation from the regression line or line of means $\bar{y}_t$, to the standard deviation
of the observations is

$$\sqrt{1 - \eta_{yt}^2} = \frac{1}{S'}\sqrt{\frac{1}{1260}\sum_{m}\sum_{t=1}^{63}\left\{({}_p y_t - {}_p d) - (\bar{y}_t - D)\right\}^2},$$

where $\sum_m$ indicates summation for the 20 series.

This is a measure of the closeness of fit of the observations in a series to the
mean sessional change as represented by the values $\bar{y}_t$; the larger $\eta_{yt}$, and therefore
the smaller $\sqrt{1 - \eta_{yt}^2}$ is, the more nearly does a sessional change of the same form
recur in series after series. A comparison of the values of $\sqrt{1 - \eta_{yt}^2}$ for the different
experiments will show the relative significance of their mean sessional changes.


In the present case the value of $\eta_{yt}$ is found to be $.579 \pm .013$, while
$\sqrt{1 - \eta_{yt}^2} = .815$.
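The correlation ratio of $y$ upon $t$ described above can be sketched numerically as follows. The function name `correlation_ratio` and the small test arrays are assumptions made for illustration; the 20 series of 63 observations are not reproduced here.

```python
import math


def correlation_ratio(series):
    """Correlation ratio of y on t for equal-length series, secular term removed.

    Returns (eta, sqrt(1 - eta^2)).
    """
    m, n = len(series), len(series[0])
    # Remove each series' own mean (the secular term).
    freed = [[y - sum(row) / n for y in row] for row in series]
    # Mean sessional change: average of the freed observations at each t.
    col_means = [sum(row[t] for row in freed) / m for t in range(n)]
    between = sum(c * c for c in col_means) / n
    total = sum(y * y for row in freed for y in row) / (m * n)
    eta = math.sqrt(between / total)
    return eta, math.sqrt(max(0.0, 1.0 - eta * eta))


# Hypothetical data: four series sharing one sessional pattern but with
# different means, so the sessional change recurs exactly in every series.
pattern = [2.60 - 0.002 * t for t in range(10)]
same_form = [[v + 0.05 * p for v in pattern] for p in range(4)]
eta_same, resid_same = correlation_ratio(same_form)

# Two series with opposite trends: no common sessional change at all.
eta_none, resid_none = correlation_ratio([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
```

When the same form recurs in every series the ratio approaches unity, and when the series share no common form it falls to zero; the value $.579$ quoted above lies between these extremes.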


It would be an interesting problem to obtain the correlation of the successive
residuals left after the ordinates of the "best" fitting parabola for each series had
been subtracted from the observations of that series; but although this has not
been done, a fair idea of the degree to which the correlation of the successive
judgments in the individual series is due to the sessional change can be obtained
by removing the "best" fitting straight lines from each series. The values calculated
for the ${}_p b$'s have been referred to above, and using these and the equations
(xxii) to (xxiv) of pp. 41 to 43, the values of $\sigma_1'$ and $\rho_1'$, or the standard deviations
and correlations of successive observations freed from the linear sessional changes,
have been calculated and are given in the 4th and 6th columns of Table IV. The
$\rho_1'$'s are all less than the corresponding $\rho_1$'s, except in Series X where they are


* Actually it is only the values of the Group Standard Deviations $S_1', S_2' \ldots S_{14}'$ which have been
calculated; they are not all equal (as shown in Table V) owing to the sessional change in standard
deviation, but an approximation to $S'$ sufficiently accurate for the purpose will be given by taking

$$S'^2 = \tfrac{1}{14}\left\{S_1'^2 + S_2'^2 + \ldots + S_{14}'^2\right\}.$$
equal, but though there is in general a considerable reduction, it is clear that
neither a linear sessional change nor a parabolic one of the form represented by
equation (xxx) accounts for the greater part of the correlation of successive
judgments.
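The removal of the "best" fitting straight line and its effect on the correlation of successive judgments can be sketched as follows. This is a simplified illustration under stated assumptions: the line is fitted to the whole of a synthetic series, and the correlation is taken between each value and its successor, whereas the reduction in the text works with the Groups of 50 observations and equations (xxii) to (xxiv). All names and data here are hypothetical.

```python
import math


def fit_line(ys):
    """Least-squares line through (t, y_t), t = 1..n; returns (mean_t, mean_y, slope)."""
    n = len(ys)
    ts = range(1, n + 1)
    mt, my = sum(ts) / n, sum(ys) / n
    slope = (sum((t - mt) * (y - my) for t, y in zip(ts, ys))
             / sum((t - mt) ** 2 for t in ts))
    return mt, my, slope


def detrended(ys):
    """Residuals left after subtracting the ordinates of the fitted line."""
    mt, my, slope = fit_line(ys)
    return [y - (my + slope * (t - mt)) for t, y in enumerate(ys, start=1)]


def lag1_correlation(ys):
    """Product-moment correlation between y_t and y_(t+1)."""
    a, b = ys[:-1], ys[1:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a)
                           * sum((v - mb) ** 2 for v in b))


# Synthetic series: a falling trend plus an alternating disturbance.
series = [2.6 - 0.004 * t + 0.02 * (-1) ** t for t in range(1, 51)]
rho_raw = lag1_correlation(series)             # inflated by the trend
rho_freed = lag1_correlation(detrended(series))  # trend removed
```

In this artificial case the whole of the positive correlation comes from the trend. In the series of Table IV a considerable correlation remains after the line is removed, which is the point made above.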

The coefficients $\rho_1$ and also $\rho_1'$ vary considerably from series to series, but there
is no very marked progressive secular change. On the whole both $\rho_1$ and $\rho_1'$ are
large when the standard deviation is large, and a measure of this correspondence
will be given by the correlation of $\rho$ and $\sigma$. This can be obtained most readily, and
with sufficient accuracy for the purpose, from the correlation of the ranks of these
variates, by the method referred to in Biometrika, Vol. x. p. 416*.

The results are

correlation between $\rho_1$ and $\sigma_1$, $+.52 \pm .11 = r_{\rho\sigma}$,
correlation between $\rho_1'$ and $\sigma_1'$, $+.60 \pm .10 = r'_{\rho\sigma}$.

The difference is not significant, and we may draw the conclusion, which could not
have been assumed a priori, that the correlation of successive judgments is larger
when the variations in judgment are larger, and that this relationship does not
appear to be reduced when the large linear sessional change has been removed.
Large values of $\sigma$ might have implied erratic observation and small relation between


TABLE IV. Constants of Individual Series (Trisection).

(The definition of these constants is given on p. 35.)

   1    |   2    |    3     |    4      |    5     |     6       |     7      |       8
Series  | $d_1$  |   $b$    | $\rho_1'$ | $\rho_1$ | $\sigma_1'$ | $\sigma_1$ | $\sigma_\delta$
I       | 2.6238 | +.000673 | +.2925    | +.3008   | .06015      | .06093     | .0721
II      |  .7036 | -.002964 | +.4149    | +.5485   | .08125      | .09182     | .0873
III     |  .6350 | -.003626 | +.3643    | +.5560   | .08001      | .09561     | .0901
IV      |  .5114 | +.001718 | -.2521    | -.0460   | .05356      | .05902     | .0854
V       |  .5309 | -.001555 | +.2520    | +.3234   | .06853      | .07210     | .0839
VI      |  .5132 | -.001529 | +.2270    | +.3390   | .05495      | .05921     | .0681
VII     |  .6448 | -.004244 | +.4918    | +.6457   | .09322      | .11154     | .0939
VIII    |  .5314 | -.002788 | +.5478    | +.6089   | .11369      | .12060     | .1067
IX      |  .3404 | -.004477 | +.4979    | +.7075   | .07935      | .10233     | .0783
X       |  .5590 | -.000086 | +.7151    | +.7151   | .11141      | .11141     | .0841
XI      |  .4582 | -.000972 | +.7320    | +.7381   | .08317      | .08435     | .0610
XII     |  .5014 | -.002720 | +.4851    | +.6360   | .05923      | .07105     | .0606
XIII    |  .4752 | -.003594 | +.5101    | +.6897   | .06409      | .08244     | .0649
XIV     |  .5000 | -.003818 | +.6433    | +.7965   | .05993      | .08141     | .0519
XV      |  .4290 | -.005588 | +.6810    | +.8568   | .07051      | .10711     | .0573
XVI     |  .4390 | -.003071 | +.6408    | +.7412   | .06441      | .07819     | .0562
XVII    |  .4254 | -.004369 | +.2569    | +.6556   | .05840      | .08594     | .0713
XVIII   |  .3944 | -.000580 | +.2870    | +.3144   | .04568      | .04644     | .0544
XIX     |  .3700 | -.003236 | +.4935    | +.7219   | .05107      | .06920     | .0516
XX      | 2.3666 | -.001725 | +.2850    | +.5072   | .03680      | .04443     | .0441

Mean value of $b$ = -.002425.


* The theory is based on the hypothesis that the variates follow a normal distribution, and though
this may not be strictly true for the $\rho_1$'s and $\sigma_1$'s the method probably gives a sufficiently accurate
approximation to the value of their correlation.
successive judgments, and at the same time high correlation might have been
expected to result in small variation. The significance of this will be discussed in
the concluding sections of this paper.
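The correlation of ranks used above can be illustrated with the twenty pairs of $\rho_1$ and $\sigma_1$ as read from Table IV. The sketch assumes that the method referred to is the usual one for normally distributed variates, in which the rank coefficient $\rho_r$ is converted to a product-moment correlation by $r = 2\sin(\pi\rho_r/6)$; this is an assumption about the reference, and the function names are illustrative only.

```python
import math

# rho_1 and sigma_1 for Series I..XX, as read from Table IV.
RHO_1 = [.3008, .5485, .5560, -.0460, .3234, .3390, .6457, .6089, .7075, .7151,
         .7381, .6360, .6897, .7965, .8568, .7412, .6556, .3144, .7219, .5072]
SIGMA_1 = [.06093, .09182, .09561, .05902, .07210, .05921, .11154, .12060,
           .10233, .11141, .08435, .07105, .08244, .08141, .10711, .07819,
           .08594, .04644, .06920, .04443]


def ranks(values):
    """Rank 1 for the smallest value; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def rank_correlation(xs, ys):
    """Rank coefficient 1 - 6*sum(d^2) / (n(n^2 - 1)), valid without ties."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def rank_to_product_moment(rho_rank):
    """Conversion for normal variates: r = 2 sin(pi * rho_rank / 6)."""
    return 2 * math.sin(math.pi * rho_rank / 6)


rho_rank = rank_correlation(RHO_1, SIGMA_1)
r_rho_sigma = rank_to_product_moment(rho_rank)
print(round(rho_rank, 3), round(r_rho_sigma, 2))
```

With the values as transcribed, the rank coefficient comes to about $.50$ and the converted value to about $.52$, which agrees with the $+.52$ quoted for $r_{\rho\sigma}$.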

In Table III giving the $\sigma$'s, it will be seen that in general the standard deviations
increase in the later groups; though this may be due in part to the parabolic form
of the sessional change, with its tendency to an increasing drop towards the end of
the series, it is possible that it also indicates a fatigue effect setting in, and causing
the later observations in a series to be more erratic; the same phenomenon appears
in the Bisection Experiment where there is no appreciable sessional change within
the series. It may in fact be looked on as a sessional change in the standard
deviation.

At the end of Table I are given the dates on which the different series were
carried out, remarks noted at the time as to the condition of the observer, and, for
the last 14 series, the time taken to mark off the 70 forms*. It will be seen that
there was a large gap between the times of carrying out the first six and the last
fourteen series, and this interval of nearly two months broke the continuity of the
secular change in the means of the series. In Figure 5 the means of Group 1 (or
the $d_1$'s of each series) have been plotted firstly with the order of series and
secondly with the date of series.

If $x$ is the personal equation, or mean value of the observations of Group 1 of a
series measured in inches,
$y$ the number or order of a Series,
$z$ the number of days between the 6th May and the date on which the series
was carried out,


* Reference to the 7 trial forms first marked, in addition to the 63 of the Series proper, is made on
p. 28 in footnote.
we have for the regression straight line of $x$ on $y$

$$x - 2.4976 = -.01351\,(y - 10.5) \qquad \text{(xxxii)},$$

and for the regression of $x$ on $z$

$$x - 2.4976 = -.00233\,(z - 53.05) \qquad \text{(xxxiii)},$$

and these lines have been drawn in the diagram.


The corresponding coefficients of correlation between (1) personal equation and
order, (2) personal equation and time, and (3) time and order, a meaningless
coefficient but required to find the partial correlations, are

(1) $r_{xy} = -.800 \pm .054$,
(2) $r_{xz} = -.692 \pm .079$,
(3) $r_{yz} = +.882 \pm .033$,

and the partial correlations are

$r_{xy.z} = -.559 \pm .104$,
$r_{xz.y} = +.049 \pm .150$.
But the interval between the May and July series was so large, that the series
should perhaps be considered as forming two groups, one of six and the other of
fourteen. Taking the last fourteen series, we have the regression lines

$$x - 2.4596 = -.01346\,(y - 7.5),$$

the Series VII being given the order 1, VIII, 2 etc., and

$$x - 2.4596 = -.01048\,(z - 6.64),$$

$z$ being the days between 10th July and date of Series. These lines have also been
drawn on the diagrams.

The correlations are

(1) $r_{xy} = -.674 \pm .098$,
(2) $r_{xz} = -.673 \pm .099$,
(3) $r_{yz} = +.956 \pm .016$,

giving partial correlations

$r_{xy.z} = -.143 \pm .177$,
$r_{xz.y} = -.138 \pm .177$.


The point of interest is this: there is a secular change in personal equation 
from series to series; is this change more closely related to the number of series 
or sessions that have gone before (that is, almost, to the experience gained), or is 
it due to some change with time in the observer’s outlook ? Suppose that it was 
arranged to carry out observations on a number of different days with varying 
intervals of time between them, and that on each day a certain number of different 
series of observations or sessions were undertaken at regular intervals of perhaps 
an hour or less; any series could then be classified as the pth series of the qth 
day. Then $r_{xy.z}$ (the partial correlation of personal equation and order, time being
kept constant) would give a measure of the relationship between change in personal
equation and order of series in any one day. This will not necessarily be the same
as the sessional change, for it has been supposed that this latter occurs only during
the course of a sitting, and is broken by the interval of rest in between. On the
other hand if we take all the $p$th series of the various days, then $r_{xz.y}$ (the partial
correlation of personal equation and time, order being constant) gives the relation
between change in personal equation and time, taken over a long period.

The long break in the middle of the Trisection Experiment takes away any
real significance from the difference between $r_{xy.z}$ $(-.559)$ and $r_{xz.y}$ $(+.049)$ for the
twenty series, and in the case of the last fourteen series these coefficients are
practically equal ($-.143$ and $-.138$), because the intervals between the series were
nearly uniform. In the Timing Experiments, C and D, the arrangement of the series
in groups on consecutive days leads to considerably more interesting results*.
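The partial correlations quoted for the last fourteen series can be checked from the three total correlations by the usual first-order formula, together with the probable error $.6745\,(1 - r^2)/\sqrt{n}$ used elsewhere in this paper. The sketch below is an illustration only; the function names are assumptions, and small differences from the printed values are to be expected because the inputs are rounded to three places.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def probable_error(r, n):
    """Probable error .6745 (1 - r^2) / sqrt(n) of a correlation from n pairs."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


# Last fourteen series: x = personal equation, y = order, z = time.
r_xy, r_xz, r_yz = -0.674, -0.673, 0.956

r_xy_z = partial_correlation(r_xy, r_xz, r_yz)  # order, time constant
r_xz_y = partial_correlation(r_xz, r_xy, r_yz)  # time, order constant

print(round(r_xy_z, 3), round(r_xz_y, 3))
print(round(probable_error(r_xy, 14), 3))
```

Because order and time are so highly correlated ($+.956$) in these fourteen series, neither partial coefficient can differ much from the other, which is the point made in the text.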

A comparative measure of the consistency of the consecutive judgments in the
different series is the standard deviation of first differences, or

$$\sigma_\delta = \sqrt{\frac{\sum_{t=1}^{n}(y_{t+1} - y_t)^2}{n}} = \sigma_1\sqrt{2(1 - \rho_1)}$$

approximately. The values of this expression are given in the 8th column of
Table IV.
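The relation $\sigma_\delta = \sigma_1\sqrt{2(1 - \rho_1)}$, quoted again later in this section, can be checked against the entries of Table IV. The sketch below is illustrative; the function names are assumptions, and `sd_first_differences` shows the direct computation from a series for comparison.

```python
import math


def sd_first_differences(ys):
    """Root mean square of the successive differences y_(t+1) - y_t."""
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sd_from_correlation(sigma_1, rho_1):
    """Approximate form sigma_1 * sqrt(2 (1 - rho_1))."""
    return sigma_1 * math.sqrt(2 * (1 - rho_1))


# Series XV and XVIII, using sigma_1 and rho_1 from Table IV.
xv = sd_from_correlation(0.10711, 0.8568)
xviii = sd_from_correlation(0.04644, 0.3144)
print(round(xv, 4), round(xviii, 4))
```

The two results agree with the tabled values $.0573$ and $.0544$: a series with a large standard deviation and high correlation may show no greater average jump from judgment to judgment than a series in which both are small.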

Now suppose we compare the constants in Table IV, the dates and remarks at
the end of Table I and the diagrammatic representation of seven of the series given
in Figure 6. The first series to be remarked on is IV; most of the series were
carried out at the beginning of the morning before any other work, and it is possibly
the fact that IV was done soon after a spell of measuring spectrograms with a Zeiss
comparator that explains the exceptional values of $\rho_1$ and $\rho_1'$, namely $\rho_1 = -.0460$,
$\rho_1' = -.2521$. The $\sigma_\delta$, or standard deviation of 1st differences, is no higher than for
the other series done at about the same time (in May), and the $\sigma_1$ is lower.
The first graph in Figure 6 gives the diagram of this series; the rapid fluctuations
in judgment about a very steady, if slowly changing, personal equation may perhaps
have some physiological significance. In the second and third graphs of Figure 6
are represented two of the four Series VII to X which were done when the
observer was not very fit; they have large values for $\sigma_1$, and the $\sigma_\delta$'s are large
compared with those of the ten series which follow, showing that the judgments
were rather erratic; the correlation is however high. In VIII there is a great
jump between the 44th and 45th judgments, from 2.22 to 2.66, and the gradual
drop down, which follows, to 2.20 (for the 52nd judgment) is a good example of
a way in which successive judgments are correlated. In Series XI (not represented
among the graphs) there appears to be a periodic variation, for the
correlation falls steadily from $\rho_1 = +.7381$ to a negative value of $-.4428$ at a larger interval.

XIII, XV and XVI are typical highly correlated series with large sessional
changes; the $\sigma_\delta$'s as well as the $\sigma_1$'s are considerably smaller than in the series
VII to X. In examining the fourth to sixth graphs we notice what may be called
the large scale correlated variations superimposed upon the linear sessional change;

* See pp. 70, 75 and 83, and below. 


it is due to these variations that the values of $\rho_1'$ remain so large, and it is their
absence that makes the correlation in IV so low. The last graph (Series XX) is
that for which the $\sigma_1$ and $\sigma_\delta$ are least, and yet $\rho_1$ is quite high $(+.507)$.


As a last instance of these points, we may compare the constants* for XV and
XVIII:

Series |   $b$    | $\rho_1'$ | $\rho_1$ | $\sigma_1$ | $\sigma_\delta$
XV     | -.005588 | +.6810    | +.8568   | .10711     | .0573
XVIII  | -.000580 | +.2870    | +.3144   | .04644     | .0544


Fig. 6. Trisection. Diagrams representing variations in judgment: Series IV, VIII and IX (estimate in inches plotted against order of observation in series).
The horizontal line intersecting each graph gives the mean of the first 50 observations in that series.


* For definitions of these constants the table on page 35 may be referred to.
Fig. 6. Trisection. Diagrams representing variations in judgment (continued): Series XIII, XV, XVI and XX (estimate in inches plotted against order of observation in series).
The horizontal line intersecting each graph gives the mean of the first 50 observations in that series.


XV has a large linear sessional change, but superimposed on this there must be
considerable correlated variations, for the removal of the best fitting straight line
only reduces $\rho_1$ $(.8568)$ to $\rho_1'$ $(.6810)$. XVIII has variations altogether on a smaller
scale; the correlation of successive judgments is low and it is barely affected by the
removal of the linear sessional change. And yet though the $\sigma_1$ for XV is more
than twice as great as for XVIII, the $\sigma_\delta$'s, or measures of the average jump in
estimation from judgment to judgment, are practically identical; the importance
of this constant $\sigma_\delta$ as a measure of variability of judgment is discussed on p. 69
below.
(b) The Combination of Series.


Having discussed the reduction of the individual series, I will proceed to consider
the results of combining the 20 series. The formulae (v) to (viii) on pp. 33 and
34 give the values of $D_k$, $S_k$ and $R_k$ which are tabled below. Remembering that
$D_1$ and $S_1$ are the mean and standard deviation of the combined observations
$y_1, y_2 \ldots y_{50}$ from each of the 20 series, $D_2$ and $S_2$ the mean and standard deviation
of the combined observations $y_2, y_3 \ldots y_{51}$, and finally $D_{14}$ and $S_{14}$ of $y_{14}, y_{15} \ldots y_{63}$, we
see that the progressive decrease in $D_k$ as $k$ increases indicates the shortening in
the estimate of a third during the course of a sitting, while the increase in $S_k$ may
perhaps be partly due to increasing variability of judgment due to fatigue. The
values of $R_k$ are large, but this is to be expected owing to the large changes in
personal equation from series to series; in fact for $k = 13$ it will be found that the
limiting expression of page 34 gives $+.5435$, while $R_{13} = .6151$.


The reason for this difference between the limiting value and $R_{13}$ is that
$\sum_m(\rho_{13}\,\sigma_1\sigma_{14})$, and therefore $R_{13}'$, does not vanish. The next step is to obtain the
values of $S_k'$ and $R_k'$, or the
standard deviations and correlations of successive judgments after the secular
change has been removed. They are found from Equations (ix) and (x) of p. 34
and are given in Table V (5th and 6th rows).


There is here an opportunity of testing the accuracy of the Difference Correlation
method discussed in Section V (b); the case is that of Problem 1, page 41: the
values of $R_1, R_2 \ldots R_{13}$ are known and give the correlations of 1st differences;
these, together with the coefficients of correlation of 2nd differences
to be used later, are given at the bottom of Table V. Then using the value $+.6246$
for $R_1'$, we get the values of the 12 quantities $R_2' \ldots R_{13}'$, which have been inserted
in the 7th row of Table V. It will be seen that the values obtained by this
approximate method agree well with the others, the differences being within the
probable error of the $R_k'$'s for the earlier values of $k$; beyond this the approximate
values become rapidly too small, the error, from the form of Equations (xx) and
(xxi), being clearly cumulative. This failure is certainly largely due to the fact that
the errors involved in the assumptions (a), (b) and (c) of p. 37 are not negligible
when the later groups enter into the correlation, for we have already seen that
both $D_k$ and $S_k$ change steadily with $k$.


The values of $S_k'$ and $R_k'$ in Table V correspond to the average values of the
standard deviations and correlations of successive judgments in the individual
series, i.e. of the $\sigma$'s and $\rho$'s given in Tables III and I. Owing to the sessional
change which occurs during the course of nearly all the series, $R_k'$ does not vanish
as $k$ increases, but appears to approach a limiting value in the neighbourhood of
$+.16$. By obtaining for the separate series the coefficient of correlation, $\rho_1'$, of the
successive observations at intervals of one, freed from the linear sessional change,
a step has been made towards the further reduction of the problem. $R_1''$, the
coefficient for the combined series corresponding to $\rho_1'$ of the individual series, is
given by
2S =x Tie) 


Ge ne = = (os")! 


and taking $\rho_1'$, $\sigma_1'$ and $\sigma_2'$ calculated from Equations (xxii), (xxiv) and (xxv) we
find that


“Salat st uit ponte neece (xxv) bis, 


$$R_1'' = +.48922 \pm .01623.$$


Then $R_2'', R_3'' \ldots R_{13}''$ can be found by the method of Problem 2, p. 41; or
again using the value of $R_2''$ found from the first difference equations, we may
proceed to second differences as in Problem 3, and so obtain $R_3'' \ldots R_{13}''$. In this
particular case there is no need to use the second difference equations*, but the
values of the $R_k''$'s have been worked out by both methods, as numerical examples
of the theory of Sections V (a) and (b). A comparison of the values given in
Table VI shows that there is no significant difference between the results of the
two methods†, and the agreement found earlier in this section between the
values of $R_k'$ calculated directly and those found from the difference equations
warrants confidence in the results for $R_k''$. Although the negative values of $R_k''$
are probably too large for the higher values of $k$ (just as the later positive values
of $R_k'$ in Table V, row 7, were too small), there is no doubt, I think, that the
correlation of the successive observations freed from the linear sessional change
does become negative at $k = 5$, 6 or 7 and remain negative for the higher values
of $k$. A word of qualification is necessary; the linear sessional change to be
removed has been represented by the line "best" fitting the first 50 observations
of each series, and a glance at Figure 4 shows that the mean values of the later
observations in the series of 63 would lie well off this line because of the parabolic
form of the sessional change; the negative values found for the later $R_k''$ may
probably be largely accounted for by this fact. A more satisfactory approximation
to the correlation of successive judgments freed from sessional change will be
obtained in Section XI below.


As $\sigma_\delta = \sigma_1\sqrt{2(1 - \rho_1)}$, referred to on p. 55 above, gives the standard deviation
of the first differences of consecutive judgments in a single series, we shall have as
a corresponding measure for the combined twenty series

$$S_\delta = S_1'\sqrt{2(1 - R_1')}.$$

For the Trisection Experiment

$$S_\delta = .0732.$$


* To get an idea of the order of the terms $Q_k$ and $b^2$ which are being neglected, the values were
calculated for two values of $k$, with the result
$k = 1$, $Q_1 + b^2 = -.000001064$
k=9, Q, += eet i 


† The probable errors in the Table have been calculated from the usual formula $e = \pm .6745\,\dfrac{1 - R^2}{\sqrt{n}}$,
and do not cover the errors arising from the method of approximation.


EGON S. PEARSON 61


(c) On the possible Effect of shifting the Head during the course of a Series. 

It was suggested to me that the correlation of successive judgments in this 
and in the Bisection Experiment might be due to periodic shifting of the head 
from side to side during the course of a series, some parallax effect of the two eyes 
making corresponding variations in the estimation of a third (or a half) of the 
line on the form. Now such an explanation might account for part of the corre- 
lation in these two experiments, but it could not explain the regular secular and 
sessional changes in the Trisection, except by the highly improbable hypothesis 
that the observer’s head leant over increasingly to one side during the course of a 
sitting, and that he started with it more on one side in the later series than 
in the earlier ones. But beyond this, the fact that correlation is found also in the 
timing experiments suggests that it is of deeper and more complex origin. It is 
likely to arise from many unknown causes affecting the environment and condition 
of the observer, and if one of them is a relative shifting of the eyes, it is of interest, 
for it will enter into many kinds of observations, where the observer who takes 
the readings is not looking through a fixed eyepiece. 

To test the effect of a relative shift between head and paper, 42 of the forms 
were taken, and trisected in the usual way, but for alternate groups of seven the 
paper was shifted 4 inches relatively from side to side. The measures of the 
estimates and their means are in Table VII. The three sets of seven under the 
heading I, were made with the forms in one position, the three sets under II 
with the forms shifted 4 inches to the right. The difference is noticeable at once; 
readings I are smaller than II, and at the same time the curious effect of sessional
change is appearing, the later readings of I and again of II, being on the whole 
smaller than the earlier ones. Now in carrying out the observations of the 
Trisection and Bisection Series, the body and head were kept as steady as possible, 
and it is unlikely that frequent shifts as large as 4 inches could have occurred ; 
further the differences between the means of readings I and II, are much smaller 
than the actual variations in judgment shown in the diagrams of Figure 6. 

But as a further test, a series of 63 forms were marked off, with the head 
fixed mechanically; the results are given in Table VIII with the usual notation. 
The correlations are not as high as many of those in Table I, but they are
comparable with those of Series I, V, VI, XVIII. The sessional change is also indicated
by the decrease in $d_k$ as $k$ increases*. Without carrying through a good number of
series with fixed head, no useful comparison can be made, but I think that the 
evidence of this one series is sufficient to justify the assertion that a shifting of 
the head from side to side cannot account for the greater part of the correlation 
of successive judgments. 

(d) Summary. 

First considering the individual series, it was noticed that there was a secular 
change in Personal Equation with time—i.e. the means decreased in passing from 

* The value of $\sigma_\delta$, or .074, may be compared with that of $S_\delta$ for the ordinary series of the
Experiment A, which was .0732.
the earlier to the later series; in addition there was a remarkably constant
sessional change within each series, this change being again a decrease from the
earlier to the later observations. There was something in these changes almost
analogous to an elastic strain: during the course of a series the estimation of a
third drops, in the interval between the series there is a recovery, but not a
complete recovery, for the first judgments in the succeeding series start at a
little lower level than the first, but well above the last judgments in the series
before; this slight "permanent deformation" caused by the "strain" represented
in the sessional change results in the secular fall. The figure below gives an
ideal representation of this.


$A_1B_1$, $A_2B_2$, ... sessional change in Series 1, 2 ... etc.
$B_1A_2$, $B_2A_3$, ... "recovery" during interval between Series 1 and 2, 2 and 3 etc.
$M_1M_2$ the resulting secular change.


Then combining the twenty series, in order to get more reliable results, the
coefficients of correlation of successive judgments, $R_k$, were obtained; owing to
the secular and sessional changes these coefficients had very high values and as $k$
increased, apparently tended to a limit at about $+.60$. By fitting the means of
the series together, the secular change was eliminated, and a series of coefficients
$R_k'$ obtained, which represented the average value of the correlation in a series;
owing to the sessional change the $R_k'$'s did not appear to tend to zero as $k$
increased but to a limit at $+.16$ or $+.15$. The correlation of successive values of
the residuals, left after subtracting the ordinates of the straight line "best" fitting
the first 50 observations of each series from the observations of that series, gave
a set of coefficients $R_k''$, which fell off very rapidly and became negative when $k$
equalled 6 or 7; the large negative values of the coefficients for the high values of
$k$ were probably due in part to the method of approximation used, and also to the
fact that the straight line fitting the first 50 observations in a series did not
represent satisfactorily the sessional change.


The values of $R_k'$ calculated (up to $k = 13$) gave no evidence of any tendency
to periodicity in this coefficient, although there was evidence of this occurring in
some of the individual series; periodicity in $R_k'$ would indicate marked variations
of roughly the same period occurring at any rate in a large number of the series.
It will be shown in a later section that the values of $R_k'$, $k = 1 \ldots 13$, can be fitted
very closely by a curve of the type $y = p + qr^k$, where $p$, $q$ and $r$ are constants.


Finally it was shown that the correlation of successive judgments could not be 
due to a shifting of the head during the course of observation, although this 
might perhaps be one of many contributory causes. 


Fig. 8. Experiment A. Trisection. Correlation-Interval Diagram: correlation of successive judgments plotted against interval between successive judgments.


In Figure 8 the values of $R_k$, $R_k'$ and $R_k''$ (for linear sessional change) are
plotted to $k$; the theoretical curves of the Equations (xx) and (lvi) shown in the
Figure will be discussed in Section XI, and also the points referred to as Ry’, 
VII. EXPERIMENT B (BISECTION). REDUCTION OF OBSERVATIONS.

(a) The individual Series. 

In this Experiment the coefficients of correlation of successive judgments for
the individual series were not all worked out, but only the values of $\rho_1$; these are
tabled with $\sigma_1$ and $d_1$ in Table IX. The values of $d_k$ for each series, $k = 1 \ldots 14$,
are also given in Table X. It will be seen that there are not the same marked
secular or sessional changes as characterised the Trisection Series. In Figure 9
the means of Groups 1 of each Series (or the $d_1$'s) have been plotted to "order"
and to "time," and again if

$x$ is the personal equation or mean,
$y$ the number or order of series,
$z$ the number of days between 13th June and the date on which the series
was carried out,

we have for the regression lines

$$x - 2.8793 = -.00101381\,(y - 10.5) \qquad \text{(xxxiv)},$$
$$x - 2.8793 = -.0005359\,(z - 32.10) \qquad \text{(xxxv)}.$$

These lines have been drawn in the diagrams; the coefficients of correlation
are

$r_{xz} = -.337 \pm .134$, $r_{xy} = -.156 \pm .147$, $r_{yz} = +.945 \pm .016$,

giving partial correlations
oj p= 20 109, Tnz.y = — 589 + 099. 
TABLE IX. Constants of Bisection Experiments. 


: —___ | Dates (1920) and time Time taken 
Series dy O1 Pl O5=01 V2 ql _ p1) at start | for series Probable Errors ) 
= is ae | Coefficients of Corre- 

I 2°8648 | -04997 | +-4942 ‘0503 1Lamdy3 June | 6" | lation calculated from 
II 8624 | 05461 | +2609 0664 2.45 p.m.s 5 20 | x0 pairs of tf & 
Ill ‘9262 | 03821 | +-0823 ‘0518 pm. 15 ,, 5 45 | 00 pars of the vari- 
IV *8642 | 04690 | +4107 ‘0509 10 a.m.) 59 5 30 | ates. 

Vv -8290 | ‘05158 | + °5870 ‘0469 Sp.mii-” 2 i 1b. 45 

VI 9114 | 04609 | + -5768 0424 am. 30 ,, 5 45 

VII 9178 | °04415 | +2993 0523 10am.) 4 yu] | 5. 45 p 1 
VIIE | 9218 | -04766 | + -4360 ‘0506 1215 pm o> 6 0 pa 

IX "8724 | 04384 | +°1389 ‘0575 ee 6 5 30 

x ‘8990 | 04579 | +°1018 ‘0614 6.30pm. ~ ” | 6 15 80 | +:0343 

Xl -9238 | -03617 | — -0423 0522 9.30a.m.l ¢ | 6 30 ‘70 | £0486 
Sal ‘9298 | 04810 | +°5089 ‘0477 6.45 p.m.s ” | 6 45 ‘60 | +:0610 
XIII | 8806 | -04407 | + 2769 ‘0530 Aitits. Wes 5 45 ‘BO | +0715 
XIV | -8312 | 04955 | + -4445 “0522 11 ne ve Mwecat PO 20 ‘40 | +:0801 

XV 8242 | 03606 | +°3334 ‘0416 2.30 p.m. ot 56 lo 30 | +0868 
XVI ‘7976 | :04135 | +°3190 0483 pane 1G) Voie ini te fe 20 | +:0916 
XVII | -8566 | 03739 | +°5531 0353 a ae a ere a ‘10 | +:0944 
XVIII *8808 | °03497 | +2776 0420 joan INS, | 7 a | ‘00 +0945 
XIX | :8890 | -03986 | + °5407 ‘0382 pm. 19 ,, 2 

XX | 2°9030 | -02610 | + +1404 ‘0342 pias 20) = 


Mean time taken for a series of 70 observations (including the 7 preliminary trials*): 5m 58s.
Mean interval between records of judgment: 5.11 s.


* See p. 28, footnote. 
Biometrika xv 5 
The dates on which the series were carried out (the $z$'s) are given at the end
of Table IX; the distribution was more satisfactory than that of the Trisections,
and the significance of these two partial correlations will be referred to shortly.


The variation in the means of the series is much smaller than in the case of
the Trisections; we have here a range from 2.93 to 2.80 ins. while in the other,
from 2.70 to 2.34 ins.; in both cases the secular change is in the direction which
lessens the measures, i.e. the marks on the forms in the later series were on the
whole further to the observer's left hand than in the earlier series. Nor does
experience appear to increase accuracy, for the true position of the half is at 2.97
inches (and of the third at 2.51 inches).
Fig. 9. Bisection. Means of Groups 1 of each series plotted with Order of Series and Date of Series.
[Two panels: Personal Equation against Order of Series; Personal Equation (mean of 1st 50 observations of series) against Time (days from 13th June).]

Next considering the sessional change, the values of $\bar{y}_t$ (defined on p. 47) have
been plotted in Figure 10; the straight line “best” fitting these points is
$$\bar{y}_t - 2.8816 = +.0003534\,(t - 32) \qquad \text{(xxxv)},$$
where $t$ is the order of observation in a series, and the coefficient of correlation
between $\bar{y}_t$ and $t$ is $+.5294 \pm .0137$*.
Using the relations of page 48, it is found that
$$\eta_{yt} = .271 \pm .018, \qquad \sqrt{1 - \eta_{yt}^2} = .963,$$
and on comparing this latter value with that for the Trisections (.815) we see
that in the present case the mean sessional change is of less significance.


Fig. 10. Bisection. Diagram of Mean Sessional Change. t, Order of Observation in Series.
[Plot of $\bar{y}_t$ (inches) against $t$, showing the mean at 2.8816 inches and the regression line $\bar{y}_t - 2.8816 = +.0003534\,(t - 32)$. The value of the true half is 2.97 inches.]

It will be noticed in looking at Figure 10 that the points $(t, \bar{y}_t)$ appear to be
subject to a fairly consistent periodic variation about the regression line, the
complete period covering from 20 to 22 observations. Without a detailed analysis
of the separate series, it is not possible to say whether there is a period of this
order underlying the variations in judgment in all series, or whether this
periodicity in $\bar{y}_t$ results from large variations in one or two series; the diagrams
of seven of the series, in Figure 11, do not certainly suggest any marked periodic
variation, and it is possible that the drop at about the 44th and the peak near the
55th observations in Series V and VI would go far to account for the similar
features in the $\bar{y}_t$ diagram, the “y” scale of which is four times greater than that
in Figure 11.


* This correlation between the mean $t$th observation ($\bar{y}_t$) and $t$ must be distinguished from the
correlation between the $t$th observation ($y_t$) and $t$, which is +.143, and, as it should be, less than $\eta$.


Fig. 11. Bisections. Diagrams representing variations in judgment (estimate in inches).
The horizontal line intersecting each graph gives the mean of the first 50 observations in that series.


Using the method of Correlation of Ranks*, the correlation between $\sigma_1$ and $\rho_1$
has been calculated for the 20 Series; the result is


$$r_{\sigma_1\rho_1} = +.420 \pm .124.$$


Another coefficient which may be calculated is that of the correlation between
$\sigma_\delta$, or the standard deviation of first differences of consecutive judgments, and $\rho_1$;
using the same method as for $r_{\sigma_1\rho_1}$, it is found that


$$r_{\sigma_\delta\rho_1} = -.416 \pm .125, \quad\text{and again}\quad r_{\sigma_\delta\sigma_1} = +.465 \pm .118.$$


Now $\rho_1$, $\sigma_1$ and $\sigma_\delta$ are not three independent quantities, as they are connected
by the relation

$$\sigma_\delta = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(y_{t+1} - y_t)^2} = \sigma_1\sqrt{2(1-\rho_1)},$$

and it is open to question which two are the most fundamental. In the ordinary
theory of the Combination of Observations, where it is assumed that $\rho_1$ is zero,
it is natural to consider $\sigma_1$ (or $\sigma$) as a fundamental constant, the measure of the
accuracy of judgment; $\sigma_\delta$ appears to have no special significance and merely
equals $\sqrt{2}\,\sigma$. If however there is a correlation of successive judgments, $\sigma$ loses its
importance; if we take a small number, $p$, of successive observations and calculate
their standard deviation, $s_p$, we can no longer say that $s_p$, subject to its probable
error $\pm .6745\,\sigma/\sqrt{2p}$, will be equal to $\sigma$, the standard deviation of a long series of
judgments. On the other hand there is every reason to expect that the $\sigma_\delta$ found
from a few observations will give a fair approximation to the $\sigma_\delta$ found from a
large number. $\sigma$ is dependent to a high degree on the sessional change; for
example it has been shown† that if this change can be represented by a straight
line of the form $y = bt$, then $\sigma'$, or the standard deviation of the observations freed
from this change, is given by

$$\sigma'^2 = \sigma^2 - \frac{b^2}{12}(n^2 - 1).$$
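
The two relations just quoted lend themselves to a quick numerical check. The sketch below is illustrative only: it evaluates $\sigma_1\sqrt{2(1-\rho_1)}$ for Series I and XX using the constants as read from the Bisection table, and confirms that a pure straight line $y = bt$ over $t = 1, \ldots, n$ contributes a variance of $b^2(n^2-1)/12$, which is the term subtracted in passing from $\sigma^2$ to $\sigma'^2$. The slope and length used for the line are hypothetical values chosen for the demonstration.

```python
import math

def sigma_delta(sigma, rho):
    """S.D. of first differences implied by sigma and the correlation rho
    of successive judgments: sigma * sqrt(2 (1 - rho))."""
    return sigma * math.sqrt(2.0 * (1.0 - rho))

def population_variance(values):
    """Mean squared deviation about the mean (divisor n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_removed_by_line(b, n):
    """Variance contributed by a linear change y = b t over t = 1..n."""
    return b * b * (n * n - 1) / 12.0

# Constants of Series I and Series XX as read from the table (assumed readings).
sd_series_I = sigma_delta(0.04997, 0.4942)
sd_series_XX = sigma_delta(0.02610, 0.1404)

# Hypothetical slope and series length, for illustration only.
b, n = 0.001, 50
trend = [b * t for t in range(1, n + 1)]
trend_variance = population_variance(trend)

if __name__ == "__main__":
    print(round(sd_series_I, 4), round(sd_series_XX, 4))
    print(trend_variance, variance_removed_by_line(b, n))
```

The first pair of values should agree with the tabulated $\sigma_\delta$ column to about the fourth decimal, and the last two numbers should coincide up to rounding error.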

It is true that $\sigma_\delta$ is dependent to some extent on the sessional change, but far
less so; for instance in the case of the linear sessional change, $\sigma_\delta'$, the standard
deviation of the first differences of the successive residuals left after the removal
of the line, is given approximately by the relation

$$\sigma_\delta'^2 = \sigma_\delta^2 - b^2.$$
And for any form of sessional change which is likely to occur in experiments
of the type we are considering, the correction to the difference between two
successive observations necessary to get the corresponding difference between the


* p. 52 and footnote.
† Section V (b) p. 43.
residuals after the removal of the sessional term, will be very small indeed compared
with the standard deviation of this difference, or $\sigma_\delta$. It is therefore suggested
that in the combination of correlated observations, $\sigma_\delta$, the average value of the
jump in estimation between two successive judgments, is of more fundamental
importance than $\sigma$. As an example, consider the diagrams of the observations of
Series X and Series XX in Figure 11; the correlation, $\rho_1$, is very low in both
cases, but it is suggested that the physiological significance of the difference
in type between the two lies in the fact that $\sigma_\delta$ for Series X is nearly twice as
large as $\sigma_\delta$ for Series XX, rather than in the difference in the $\sigma_1$'s. Or again in
the diagrams of the Trisection Experiment, Figure 6, I would emphasise the
same point in a comparison of the difference between the two highly correlated
Series VIII and XVI.
Now returning to the coefficients of partial correlation
$$r_{xy.z} = +.529 \pm .109, \qquad r_{xz.y} = -.589 \pm .099.$$


With the interpretation suggested on p. 54 for these coefficients, we are led 
to a rather suggestive conclusion. If we are dealing with a number of series 
carried out at equal intervals of time in the course of one, or even perhaps two 
days, but effectively at one epoch when comparison is made with the long range 
of nearly 70 days covered by the Bisection Series, then the correlation between 
$x$ and $y$ is positive, or the pencil mark in the later series tends to be made
further to the observer’s right than in the earlier series; this change is in the 
same direction as the sessional change within a series. There is indeed a curious 
coincidence, on which of course no stress must be laid, 

$$r_{xy.z} = +.529 \pm .109, \qquad r_{\bar{y}_t t} = +.5294 \pm .0137.$$

That is to say the correlation between the mean of a series and the order of 
that series when a number of series are done in close succession, is of the same 
sign and magnitude as the correlation between the mean ¢th observation and its 
order, ¢, in the series. But if we are dealing with all the pth series of sets which 
have been carried out on different days with varying and perhaps many days’ 
interval between, then the coefficient $r_{xz.y}$ is negative, or the bisection-marks on
the later days have on the whole a tendency to move to the left of the observer ; 
this is in the direction of the secular change. 


The conclusion which it seems possible to draw is this; if a number of series 
are done at very short intervals, the interval of rest between the series will not 
be sufficient to break the effect of the sessional change; but if a considerable 
interval elapses between the carrying out of the series, then the sessional change 
in one series has no influence on the judgments in the succeeding series, but a 
quite distinct secular change may be noticeable. In the Bisection Experiment 
both secular and sessional changes are very small, but they are acting in opposite 
directions. If these two changes are due to different physiological factors, it 
seems possible that it is the fact that they are acting in opposite directions in 
the Bisection Experiment which causes them to be of so much smaller magnitude 
than in the Trisection Experiment, where they were acting in the same direction. 
(b) The Combination of the Series.


For the combined series, the coefficients of correlation of successive judgments
$R_k$ for $k = 1, 2 \ldots 13$ were calculated from 13 correlation tables each based on
the 1000 combined observations; the results for $D_k$, $S_k$, and $R_k$ are tabled below
(Table XI). The effect of the slight sessional change is noticeable in the increasing
values of $D_k$.


Using the values of $D_k$, $S_k$ and $R_k$, and of $d_k$ from Table X, Equations (vi),
(vii) and (viii) give $\frac{1}{m}\Sigma(\rho_k\sigma_k\sigma_{k+1})$ and $\frac{1}{m}\Sigma(\sigma_k^2)$ for $k = 1, 2 \ldots 14$. Equations (ix)
and (x) then give the values of $S_k'$ and $R_k'$ contained in the 5th and 6th rows of
Table XI. The value of $R_1'$ found by this method should be compared with that
found with the help of the $\rho_1$'s, $\sigma_1$'s and $\sigma_2$'s of the individual series, namely

$$R_1' = \frac{\Sigma(\rho_1\sigma_1\sigma_2)}{\sqrt{\Sigma(\sigma_1^2)\,\Sigma(\sigma_2^2)}} = +.3578 \pm .0186 \qquad \text{(x) bis}.$$


The difference, which is well within the probable error, arises from the fact that
$R_1$ has been found by grouping the observations in a correlation table, while the
$\rho_1$'s, $\sigma_1$'s and $\sigma_2$'s were found by direct multiplication of the crude values of the
observations.


Another method of obtaining the R,’s is from the first difference correlation 
equations, or the method of Problem 1, p. 41; the results are given in the 
7th row of Table XI, while the constants ,R;, the coefficients of correlation of 
successive first differences required in the solution, are in the 8th row of the 
Table. Comparing the values of $R_k'$ found by the two methods, we find good
agreement up to $k = 6$, but beyond this point the $R_k'$'s of the second and
approximative method assume much too large negative values*. It is however evident
from the results of the first method that $R_k'$ does become negative, and as it
could not remain negative indefinitely as $k$ increased, there seems here to be
another indication that a periodic variation exists among the judgments at any
rate in a certain number of the series. For a complete period covering from
20 to 22 observations suggested by the $\bar{y}_t$ diagram, $R_k'$ should have a minimum
value at R, or R,’; the figures suggest that the minimum occurs somewhat 
earlier, at about R,’, but the probable errors for these small coefficients are very 
large. When time is available it would be interesting to examine further the 
significance of this periodicity. 

The points $(R_k, k)$ and $(R_k', k)$ have been plotted in Figure 12.

It will be noticed that the $S_k$'s in the later groups are larger than in the
earlier, this suggesting again, as in the case of the Trisections, that the
observations become slightly more erratic towards the end of a series.


* This result tends to confirm the suggestion made on p. 60 that the difference correlation method
gave too large negative values for $R_k''$ in the Trisection Experiment.
(c) Comparison with Experiment A.


The difference between the results of the two experiments is probably due 
to the fact that the estimation of a half is so much easier than the estimation of 
a third. The variations in the latter observations are all on a larger scale than 
in the former; the secular and sessional changes are very much greater, and if we 
compare the values of the fundamental constants, we find: 


             | $S_1'$ | $S_\delta$ | $R_1'$
Trisection   | .0845  | .0732  | +.6246 ± .0130
Bisection    | .0436  | .0495  | +.3519 ± .0187


Fig. 12. Experiment B. Bisection. Correlation-Interval Diagram.
[Plot of $R_k$ and $R_k'$, the correlation of successive judgments, against the interval between successive judgments.]


and even after the removal of the greater part of the sessional change (the best
fitting straight lines) the coefficient $R_1''$ for the Trisections is +.4892, or greater
than $R_1'$ for the Bisections. The ratio of the values of $S_\delta$, or roughly 3 to 2, is a
measure of the relative uncertainty of the observer in making his estimate in the
two different Experiments.

There is some evidence for a slight periodicity in the judgments in the
Bisection Series; if there is any period in the Trisections it must cover at least
26 observations, for there is no indication of a significant increase in the values of
$R_k$, as far as calculated, i.e. up to $R_{13}$.


VIII. EXPERIMENT C. COUNTING OF 10 SECONDS.
REDUCTION OF OBSERVATIONS.

(a) The Individual Series. 

The values of $d_1$, $\sigma_1$ and $\rho_1$ for each of the 20 series are given in Table XII as
well as the hour and date; the means ($d_1$) have been plotted to the order of series
in Figure 13.

If $x$ is the mean in the factor e/p for a series,

$y$ the order of series,
$z$ the time in hours and fractions of an hour between 2.0 p.m. on December
13, and the commencement of series,
we have for the regression lines,
$$x - .9186 = -.006056\,(y - 10.5) \qquad \text{(xxxvi)},$$
$$x - .9186 = -.001552\,(z - 38.24) \qquad \text{(xxxvii)},$$
of which the first is represented in Figure 13.

Fig. 13. 10 Second Counting. y, Order or Number of Series.

The coefficients of correlation are
$$r_{xy} = -.754 \pm .065, \qquad r_{xz} = -.775 \pm .060, \qquad r_{yz} = +.977 \pm .007,$$
giving partial correlation coefficients
$$r_{xy.z} = +.022 \pm .151, \qquad r_{xz.y} = -.271 \pm .140.$$
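
The passage from the three total correlations to the two partial coefficients can be retraced with the standard first-order partial correlation formula, $r_{ab.c} = (r_{ab} - r_{ac}r_{bc})/\sqrt{(1-r_{ac}^2)(1-r_{bc}^2)}$. The sketch below assumes the total correlations read as $-.754$, $-.775$ and $+.977$; because these inputs are rounded to three places and $r_{yz}$ is close to unity, the results can only be expected to agree with the printed partials to about the second decimal.

```python
import math

def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1.0 - r_ac ** 2) * (1.0 - r_bc ** 2))

# Total correlations for the counting experiment (assumed readings):
# x = mean of series, y = order of series, z = time of series.
r_xy, r_xz, r_yz = -0.754, -0.775, 0.977

r_xy_z = partial_r(r_xy, r_xz, r_yz)   # order, with time held constant
r_xz_y = partial_r(r_xz, r_xy, r_yz)   # time, with order held constant

if __name__ == "__main__":
    print(f"r_xy.z = {r_xy_z:+.3f}")
    print(f"r_xz.y = {r_xz_y:+.3f}")
```

The near-cancellation in the numerator of $r_{xy.z}$ shows why that coefficient is insignificant even though $r_{xy}$ itself is large.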


The secular change corresponds to a gradual decrease in estimate throughout
the course of the experiment; the value of the factor e/p for a true 10 second
estimate would be $10.0/10.2 = .980$, and this was closely approached by the means
of the first three series, which were carried out on the first day, shortly after trial
counts had been made with a watch. No further check with a watch was made
during the succeeding days, and the length of estimation decreased and finally
appears to have reached a fairly steady value at about .88. The mean for the
20 Series was .9186, or a count of 9.37 seconds.

TABLE XII.
Constants of Individual Series (Counting Seconds).

Series | $d_1$  | $\sigma_1$ | $\rho_1$ | $\sigma_\delta = \sigma_1\sqrt{2(1-\rho_1)}$ | Date (1920) | Time at Start
I      | .9786  | .04030 | +.5283 | .0391 | 13 December | 2.30 p.m.
II     | 1.0140 | .04331 | +.4988 | .0434 | 13 December | 3.15 p.m.
III    | .9998  | .03844 | +.0625 | .0526 | 13 December | 3.45 p.m.
IV     | .9446  | .03732 | +.4027 | .0408 | 14 December | 10.15 a.m.
V      | .9128  | .03394 | +.4378 | .0360 | 14 December | 11.20 a.m.
VI     | .9090  | .03015 | +.5437 | .0288 | 14 December | 12.0 noon
VII    | 1.0070 | .03981 | +.3819 | .0443 | 14 December | 2.30 p.m.
VIII   | .9012  | .02488 | +.4550 | .0260 | 14 December | 3.5 p.m.
IX     | .8886  | .03934 | +.4326 | .0419 | 14 December | 3.35 p.m.
X      | .9030  | .02851 | +.5439 | .0272 | 15 December | 10.0 a.m.
XI     | .9130  | .02982 | +.5326 | .0288 | 15 December | 10.35 a.m.
XII    | .8774  | .01852 | +.2850 | .0221 | 15 December | 11.10 a.m.
XIII   | .9046  | .02402 | +.4894 | .0243 | 15 December | 11.50 a.m.
XIV    | .9464  | .02903 | +.5085 | .0288 | 15 December | 2.30 p.m.
XV     | .8880  | .04162 | +.7589 | .0289 | 15 December | 3.5 p.m.
XVI    | .8812  | .04947 | +.8549 | .0266 | 16 December | 10.0 a.m.
XVII   | .8828  | .03945 | +.6566 | .0327 | 16 December | 10.30 a.m.
XVIII  | .8872  | .02750 | +.5406 | .0264 | 16 December | 11.5 a.m.
XIX    | .8468  | .02486 | +.1266 | .0329 | 16 December | 11.35 a.m.
XX     | .8864  | .03345 | +.6369 | .0285 | 16 December | 12.10 p.m.

Probable Errors of Coefficients of Correlation calculated from 50 pairs of the variates.

$\rho$ | P.E.
.80    | ± .0343
.70    | ± .0486
.60    | ± .0610
.50    | ± .0715
.40    | ± .0801
.30    | ± .0868
.20    | ± .0916
.10    | ± .0944
.00    | ± .0954


With the interpretation of p. 54, the insignificant value of the coefficient
$r_{xy.z}$ suggests that for a number of series done in quick succession, there will be
no change in personal equation; we shall therefore not expect to find any large
general sessional change in the series. The diagram of mean sessional change is
given in Figure 14, where $\bar{y}_t$ is plotted to $t$.

The equation of the straight line best fitting the points is

$$\bar{y}_t - .919 = +.0000731\,(t - 32) \qquad \text{(xxxviii)},$$
and has been drawn in Figure 14.


Using the relations and interpretation of page 48, it is found that


$$\eta_{yt} = .2132 \pm .018 \quad\text{and}\quad \sqrt{1 - \eta_{yt}^2} = .977,$$
so that the mean sessional change is of even less significance than for the Bisections.
In fact it is clear from the diagram that the regression line (xxxviii) very nearly
coincides with the line of mean judgment, $y = .919$.


Fig. 14. 10 Second Counting. Diagram of Mean Sessional Change. t, Order of Observation in Series.
[Plot of $\bar{y}_t$ against $t$, showing the mean at .919 and the regression line $\bar{y}_t - .919 = +.0000731\,(t - 32)$. The value of the factor for a true 10 seconds is .980.]


The $\sigma_\delta$'s have been found for all the individual series, and using the values
of $S_1'$ and $R_1'$ given below, we have for the combined series


$$S_\delta = S_1'\sqrt{2(1 - R_1')} = .0338.$$


The method of correlation of Ranks gives
$$r_{\sigma_1\rho_1} = +.329 \pm .135,$$
showing again that large variation is associated with high correlation.


In Figure 15 are given eight representative series graphs which provide a good 
illustration of the variations in judgment. In the first two graphs (I and III),
$\sigma_\delta$ is large and there are many sudden fluctuations, but in the later series $\sigma_\delta$ is
lower and very constant in value. What may be described as the smoothness
in the change of judgment is in some cases particularly noticeable ; for example in 
the stretch between 

Yu and y;;, Series VI 
Yy. and Y», Series XII 
Ys, ANd Ys, Series XV 


In making comparison with the similar diagrams for Trisections and Bisections 
allowance must be made for the differences in scale, but I think it is clear that this 
“smoothness” or gradual variation is a special feature of the 10 second counting ; 
there is for instance no diagram of Trisections or Bisections which can compare 
with that of Series XVI of the counting, for high correlation combined with very 
gradual variation. But such a result is not unexpected, if the procedure of the 
experiment with the continuous counting be remembered. 


A further point of interest is to examine how far a sudden “break” or
discontinuity in the length of estimate influences the succeeding judgments. Among the
1000 observations forming the Groups 1 of the 20 series there are 61 “breaks” or
differences between successive judgments of .07 or over (in terms of the factor e/p),
i.e. of over twice $S_\delta$, the standard deviation of first differences. In the diagram of
Figure 16, ten observations are represented; the break between $y_{t-1}$ and $y_t$, or
$y_t \sim y_{t-1}$, is supposed to be equal to, or greater than, .07. If this large break
influences the succeeding judgments, it is to be expected that the differences
$y_t \sim y_{t+1}$, $y_t \sim y_{t+2}$, ... etc. will be smaller on the average than the differences
$y_t \sim y_{t-1}$, $y_t \sim y_{t-2}$, ... etc.


Fig. 15. 10 Second Counting. Diagrams representing variations in judgment.
The horizontal line intersecting each graph gives the mean of the first 50 observations in that series.
In the first row of Table XIII are given the standard deviations of these
differences taken from the 61 breaks; now in 14 of these cases there is what may
be called a “double break,” that is, after making one large variation to $y_t$, the
judgment returns approximately to its previous state, both $y_t \sim y_{t-1}$ and $y_t \sim y_{t+1}$
being greater than or equal to .07. While such cases may represent true variations
in judgment, it is very possible that they result from some accidental error, a
slowness in pressing the tapping key or in catching up the counting at the
commencement of the observation, which was realised at the time and was not due to
a real change in estimate. In the second row of the Table, therefore, are given the
standard deviations taken from the 33 sets where there was no double break, and,
in the third row, the standard deviations of 1st differences (taken from the whole
1000 judgments)

between $y_t$ and $y_{t\pm1}$ = $S_\delta = S_1'\sqrt{2(1 - R_1')}$,
between $y_t$ and $y_{t\pm2}$ = $S_1'\sqrt{2(1 - R_2')}$,
between $y_t$ and $y_{t\pm3}$ = $S_1'\sqrt{2(1 - R_3')}$, etc.


Fig. 16. Experiment C. Effect of a large break in judgment.

TABLE XIII.

Standard Deviations of Differences between Judgment after “Break”, $y_t$,
and the Judgments $y_{t-5}$ to $y_{t+5}$.

Previous Judgments
No. of Row | Number of Judgments | $y_{t-5}$ | $y_{t-4}$ | $y_{t-3}$ | $y_{t-2}$ | $y_{t-1}$
1 | From 61 sets    | .0692 ± .0042 | .0647 ± .0040 | .0624 ± .0038 | .0624 ± .0038 | .0851 ± .0052
2 | From 33 sets    | .0757 ± .0063 | .0682 ± .0057 | .0636 ± .0053 | .0667 ± .0055 | .0810 ± .0067
3 | From total 1000 | .0442 | .0416 | .0401 | .0373 | .0338

Succeeding Judgments
No. of Row | Number of Judgments | $y_{t+1}$ | $y_{t+2}$ | $y_{t+3}$ | $y_{t+4}$ | $y_{t+5}$
1 | From 61 sets    | .0476 ± .0029 | .0541 ± .0033 | .0553 ± .0034 | .0549 ± .0034 | .0623 ± .0038
2 | From 33 sets    | .0346 ± .0029 | .0421 ± .0035 | .0508 ± .0042 | .0483 ± .0040 | .0635 ± .0053
3 | From total 1000 | .0338 | .0373 | .0401 | .0416 | .0442


The probable errors are calculated from the usual expression, $\pm .6745\,\sigma/\sqrt{2n}$.
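
The probable errors attached to the first two rows of Table XIII can be recomputed from the expression just given. The sketch below takes the standard deviations as read from the table (an assumption, since the scan is imperfect) with $n = 61$ and $n = 33$ respectively; the list and function names are illustrative.

```python
import math

def probable_error_of_sd(sd, n):
    """Probable error of a standard deviation found from n observations:
    0.6745 * sd / sqrt(2 n)."""
    return 0.6745 * sd / math.sqrt(2.0 * n)

# Standard deviations for y_{t-5}..y_{t-1}, y_{t+1}..y_{t+5} (assumed readings).
row_61_sets = [0.0692, 0.0647, 0.0624, 0.0624, 0.0851,
               0.0476, 0.0541, 0.0553, 0.0549, 0.0623]
row_33_sets = [0.0757, 0.0682, 0.0636, 0.0667, 0.0810,
               0.0346, 0.0421, 0.0508, 0.0483, 0.0635]

pe_61 = [round(probable_error_of_sd(s, 61), 4) for s in row_61_sets]
pe_33 = [round(probable_error_of_sd(s, 33), 4) for s in row_33_sets]

if __name__ == "__main__":
    print(pe_61)
    print(pe_33)
```

With these readings the recomputed values reproduce the printed probable errors to four decimals, which also serves as a check on the transcription of the table.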


If we consider the values of these standard deviations together with their
probable errors, we may say definitely that the effect of a large break or
discontinuity in judgment is quite significant, and that the influence appears to last for
at least four or five judgments. It cannot of course be decided whether the
breaks were caused by some chance external factor, or were due to a conscious
change in estimate made by the observer on deciding, whether rightly or wrongly,
that his second count was too short or too long*.

It will be noticed that the standard deviations of differences between these 
special pairs of judgments are in all cases greater than the corresponding standard 
deviations from the total 1000 judgments; this is to be expected, for the
judgments $y_t$ from which all the differences are taken are not a random selection of 61
(or 33) judgments, but include many of the most erratic and therefore those 
furthest from the mean. 


(b) The Combination of the Series. 

In combining the twenty series, $D_k$, $S_k$, and $R_k$ were calculated from the thirteen
correlation tables of the judgments, and the values of these constants are given in
Table XIV below. A glance at any one of the correlation tables showed that the
1000 judgments in any group did not follow a normal distribution, and in order
to get a measure of this, the coefficient of skewness for the 1000 judgments in
the combined Groups 1 (i.e. for the judgments $y_1, y_2 \ldots y_{50}$ of the twenty series)
was calculated from the expression

$$\text{Skewness} = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2\,(5\beta_2 - 6\beta_1 - 9)},$$

where $\beta_1$ and $\beta_2$ are the fundamental ratios of the moments about the mean given by

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2}.$$


The result was as follows : 
Bi 02056 Ga 2. 9739, Skewness = ‘3684 + °0339, 
showing a very significant degree of skewness, and the frequency follows a Type I 
curve of limited range. 
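
The skewness expression can be evaluated from a sample as sketched below. The sketch assumes the usual definitions $\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$, with moments taken about the mean using divisor $n$; the function names and the small data set are illustrative and are not the 1000 judgments of the experiment. The expression yields a magnitude only, the sign being conventionally taken from $\mu_3$.

```python
import math

def central_moments(values):
    """Second, third and fourth moments about the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((v - mean) ** 2 for v in values) / n
    mu3 = sum((v - mean) ** 3 for v in values) / n
    mu4 = sum((v - mean) ** 4 for v in values) / n
    return mu2, mu3, mu4

def pearson_betas(values):
    """The ratios beta_1 = mu3^2 / mu2^3 and beta_2 = mu4 / mu2^2."""
    mu2, mu3, mu4 = central_moments(values)
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2

def pearson_skewness(beta1, beta2):
    """Skewness = sqrt(beta1) (beta2 + 3) / (2 (5 beta2 - 6 beta1 - 9))."""
    return math.sqrt(beta1) * (beta2 + 3.0) / (2.0 * (5.0 * beta2 - 6.0 * beta1 - 9.0))

# A symmetric illustrative sample: beta_1 vanishes, so the skewness is zero.
b1_sym, b2_sym = pearson_betas([1.0, 2.0, 3.0, 4.0, 5.0])
skew_sym = pearson_skewness(b1_sym, b2_sym)

if __name__ == "__main__":
    print(b1_sym, b2_sym, skew_sym)
```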


The distribution of these 1000 observations made within a period of four con- 
secutive days, gives but another example of the frequent inapplicability of the 
Normal Error Law. 


* Eleven definite interruptions in the ordinary routine of counting, due to a mistap on the key or a
miscount of the 10 seconds, were recorded at the time of observation, but only three of these resulted
in breaks of judgment > .07, the limiting value taken in the above investigation.


Using the values of $\rho_1$, $\sigma_1$ and $\sigma_2$, $R_1'$ is obtained from

$$R_1' = \frac{\Sigma(\rho_1\sigma_1\sigma_2)}{\sqrt{\Sigma(\sigma_1^2)\,\Sigma(\sigma_2^2)}} = +.5200 \pm .0156 \qquad \text{(x) bis},$$
and the remaining values of $R_k'$, $k = 2, \ldots 13$, by the approximate method of
Problem 1, p. 41. Perhaps the chief source of error in the method is variation
in $S_k$, which has been assumed constant; in this experiment the range of $S_k$ is only
1.8% compared with 3.6% for the Trisections and 2.5% for the Bisections, and the
results which are contained in the 6th row of Table XIV may be regarded,
therefore, with reasonable confidence. As before, for the higher values of $k$, $R_k'$ may be
a little too low, and as a test of the amount of cumulative error which may be
affecting $R_{13}'$, I have worked out this constant directly from the relations

$$R_{13}' S_{13}' S_{14}' = R_{13} S_{13} S_{14} - \frac{1}{m}\Sigma (D_{13} - d_{13})(D_{14} - d_{14}),$$

$$S_{13}'^2 = S_{13}^2 - \frac{1}{m}\Sigma (D_{13} - d_{13})^2, \qquad S_{14}'^2 = S_{14}^2 - \frac{1}{m}\Sigma (D_{14} - d_{14})^2,$$


Fig. 17. Experiment C. 10 Second Counting. Correlation-Interval Diagram.
[Plot of the correlation of successive judgments against the interval between successive judgments.]


with the following results:
$$R_{13}' = -.0124 \pm .0213, \qquad L_{13} = +.632*,$$
$$S_{13}' = .03420, \qquad S_{14}' = .03444.$$
R,; and presumably R,,’ are not therefore significantly negative, and it seems 
probable that $R_k'$ tends to zero as $k$ increases, without oscillating about that value.
The points $(k, R_k)$ and $(k, R_k')$ are plotted in Figure 17; the theoretical curve
drawn in the diagram will be referred to in Section XI. 
* $L_k$, or the limit to which $R_k$ approaches as $R_k'$ tends to zero, is discussed on p. 34.
IX. EXPERIMENT D. ESTIMATION OF 10 SECONDS.
REDUCTION OF OBSERVATIONS.

(a) The Individual Series. 

In Table XV are given the values of $d$ (the mean of the 63 observations of a
series, not those of Group 1 only), and of $\sigma_1$ and $\rho_1$ for the individual series; the
low values of $\rho_1$ will be noted at once, and also the high values of $\sigma_1$ compared with
those in the Counting Experiment. In Figure 19 below the means have been
plotted to order of series, and if

x is the mean in the factor e/p, 
y the order of series, 


z the time in hours and fractions of an hour between 10 a.m. on December 7th, 
and the commencement of series, 


TABLE XV.

Constants of Individual Series (Estimate of Seconds).

Series | $d$         | $\sigma_1$ | $\rho_1$        | Time of Start | Date (1920)
I      | 1.151       | .1217 | +.1518 ± .0932 | 10.45 a.m.    | 7th December
II     | 1.111       | .1254 | +.2332 ± .0902 | 11.30 a.m.    | 7th December
III    | 1.109       | .1330 | -.0249 ± .0953 | 12.10 p.m.    | 7th December
IV     | 1.052       | .1393 | +.1803 ± .0923 | 2.0 p.m.      | 7th December
V      | .973        | .1292 | +.2632 ± .0888 | 3.0 p.m.      | 7th December
VI     | [illegible] | .1349 | +.1300 ± .0938 | 10.15 a.m.    | 8th December
VII    | 1.011       | .1312 | +.3673 ± .0825 | 11.0 a.m.     | 8th December
VIII   | 1.073       | .1318 | +.1631 ± .0929 | 2.0 p.m.      | 8th December
IX     | 1.003       | .1108 | +.1976 ± .0917 | 2.30 p.m.     | 8th December
X      | 1.089       | .0989 | +.0380 ± .0953 | 3.15 p.m.     | 8th December
XI     | 1.204       | .1519 | +.3405 ± .0843 | 10.0 a.m.     | 9th December
XII    | 1.204       | .1467 | +.1415 ± .0935 | 11.0 a.m.     | 9th December
XIII   | 1.091       | .1166 | +.3241 ± .0854 | 12.0 midday   | 9th December
XIV    | 1.036       | .1059 | +.0566 ± .0951 | 2.0 p.m.      | 9th December
XV     | 1.132       | .1884 | +.4814 ± .0733 | 3.15 p.m.     | 9th December
XVI    | 1.170       | .1500 | +.1036 ± .0944 | 10.0 a.m.     | 10th December
XVII   | 1.421       | .1520 | -.0834 ± .0947 | 11.0 a.m.     | 10th December
XVIII  | 1.300       | .1591 | +.2314 ± .0903 | 12.0 midday   | 10th December
XIX    | 1.243       | .1708 | +.2260 ± .0905 | 2.0 p.m.      | 10th December
XX     | 1.170       | .1833 | +.1659 ± .0928 | 2.45 p.m.     | 10th December


Correlation between $\rho_1$ and $\sigma_1$, $r_{\sigma_1\rho_1} = +.176 \pm .146$ (calculated from correlation of ranks).


we have for the regression lines

$$x - 1.1333 = +.01018\,(y - 10.5) \qquad \text{(xxxix)},$$
$$x - 1.1333 = +.002493\,(z - 38.62) \qquad \text{(xl)}.$$
The coefficients of correlation are
$$r_{xy} = +.562 \pm .103, \qquad r_{xz} = +.638 \pm .089, \qquad r_{yz} = +.983 \pm .005,$$
giving partial correlation coefficients
$$r_{xz.y} = +.570 \pm .102, \qquad r_{xy.z} = -.470 \pm .118.$$


These latter coefficients suggest that the secular change for observations spread 
over a number of days will be a lengthening in estimation, but that, if a number 
of series are done in rapid succession, the tendency will be for a shortening; in 
fact we should expect the sessional change to be in the opposite direction to the 
secular, as for the Bisections. 


Fig. 18. 10 Second Estimation. t, Order of Observation in Series.
[Plot of $\bar{y}_t$ against $t$.]


The values of $\bar{y}_t$ have been plotted in Figure 18; the best fitting line has not
been calculated, but it would certainly correspond very closely with the mean,
$y = 1.1333$. There is in fact apparently no mean sessional change, though the drop
in the last eight values of $\bar{y}_t$ may be significant, and a mark of the tendency
suggested by the negative value of $r_{xy.z}$.


In Figure 19 the centres of the small circles represent the positions of the
means of the 63 observations of each series; these points have been fitted with
the cubic
$$x = 1.093971 + .022116\,(y - 10.5) + .001174\,(y - 10.5)^2 - .0002002\,(y - 10.5)^3 \qquad \text{(xli)},$$
which is the middle of the three curves. There is evidence of a slight secular
change, the length of the estimation increasing towards the end of the experiment.
If however it is remembered that the 20 series were carried out in 4 days, it will
be seen that there is in general a decrease in estimation in the course of the 5 series
done in any one day. It is this daily drop that the coefficient $r_{xy.z}$ ($= -.470$) is
picking out. Now in addition to the secular change in personal equation, the
figures in Table XV suggest that there is also a secular change in standard
deviation. The vertical lines on each side of the series-means in Figure 19 equal in
length the corresponding standard deviations, or $\sigma_1$'s. These values of $\sigma_1$ have been
fitted with the cubic
$$x' = .129006 + .001072\,(y - 10.5) + .000302\,(y - 10.5)^2 + .0000214\,(y - 10.5)^3 \qquad \text{(xlii)},$$
and the other two curves in the diagram have ordinates equal to $x + x'$ and $x - x'$,
so that the distance between the central curve and either of the outer curves gives
the smoothed value of the standard deviation at the point. The diagram provides
a generalised representation of a secular change in personal equation and standard
deviation.
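
The three curves of the diagram can be tabulated directly from the two cubics. The sketch below assumes the coefficients as read from equations (xli) and (xlii), which the scan renders imperfectly, and uses illustrative function names. As a rough check, the mean of the fitted personal equation over the twenty series should lie close to the general mean of 1.1333.

```python
def personal_equation(y):
    """Cubic (xli): smoothed mean estimate x for the series of order y."""
    d = y - 10.5
    return 1.093971 + 0.022116 * d + 0.001174 * d ** 2 - 0.0002002 * d ** 3

def smoothed_sd(y):
    """Cubic (xlii): smoothed standard deviation x' for the series of order y."""
    d = y - 10.5
    return 0.129006 + 0.001072 * d + 0.000302 * d ** 2 + 0.0000214 * d ** 3

# Lower curve, central curve and upper curve for each of the 20 series.
bands = [(y, personal_equation(y) - smoothed_sd(y),
          personal_equation(y),
          personal_equation(y) + smoothed_sd(y)) for y in range(1, 21)]

mean_fitted = sum(personal_equation(y) for y in range(1, 21)) / 20.0

if __name__ == "__main__":
    for y, lower, centre, upper in bands:
        print(f"{y:2d}  {lower:.4f}  {centre:.4f}  {upper:.4f}")
    print(f"mean of fitted values: {mean_fitted:.4f}")
```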

The factor for a true 10 second interval would be .98, and was most
nearly approached by the means of Series V, VII and IX, while in the case of
XVII the mean estimation nearly reached the high value of 15 seconds.


Fig. 19. Distribution of Personal Equation in Estimating Seconds.
[Plot of the estimate of 1 second against the place of series in order.]

(b) The Combination of the Series. 


In combining the twenty series, $D_k$, $S_k$ and $R_k$ were calculated from the thirteen
correlation tables of the observations of the combined series. Using the correlations
and standard deviations of the separate series, $R_1'$ is obtained from

$$R_1' = \frac{\Sigma(\rho_1\sigma_1\sigma_2)}{\sqrt{\Sigma(\sigma_1^2)\,\Sigma(\sigma_2^2)}} = +.19841 \pm .02049 \qquad \text{(x) bis},$$
and $S_1' = .14101$, $S_2' = .14056$.


Then using this value of $R_1'$, and the first difference correlation equations
(Problem 1, p. 41), $R_k'$ can be calculated for $k = 2, \ldots 12$. The values of these
quantities are given in the Table XVI below.
The fall in $R_k$ is small, and although there is considerable irregular variation
from R, onwards, it appears that $R_k$ will not vanish as $k$ increases, but approach a
constant value in the neighbourhood of +.35. This can be tested; we have from
the equations (vi) to (x)

$$R_k' = \frac{R_k S_k S_{k+1} - \frac{1}{m}\Sigma (D_k - d_k)(D_{k+1} - d_{k+1})}{\sqrt{\left\{S_k^2 - \frac{1}{m}\Sigma (D_k - d_k)^2\right\}\left\{S_{k+1}^2 - \frac{1}{m}\Sigma (D_{k+1} - d_{k+1})^2\right\}}} \qquad \text{(xliii)},$$
TABLE XVI.
Constants of Combined Series (Estimating Seconds).

$k$ | $D_k$  | $S_k$          | $R_k$          | $S_k'$            | $R_k'$
1   | 1.1421 | .1749 ± .0026 | .4825 ± .0164 | .14101 ± .00213 | +.19841 ± .02049
2   | 1.1440 | .1749 ± .0026 | .4269 ± .0174 | .14056 ± .00212 | +.1123 ± .0211
3   | 1.1415 | .1759 ± .0027 | .3965 ± .0180 |                 | +.0652 ± .0212
4   | 1.1413 | .1761 ± .0027 | .3913 ± .0181 |                 | +.0570 ± .0212
5   | 1.1421 | .1764 ± .0027 | .3764 ± .0183 |                 | +.0338 ± .0213
6   | 1.1423 | .1760 ± .0027 | .3755 ± .0183 |                 | +.0332 ± .0213
7   | 1.1416 | .1754 ± .0026 | .3983 ± .0180 |                 | +.0693 ± .0212
8   | 1.1402 | .1757 ± .0026 | .4045 ± .0178 |                 | [illegible]
9   | 1.1395 | .1763 ± .0027 | .3488 ± .0187 |                 | -.0056 ± .0213
10  | 1.1391 | .1756 ± .0026 | .3691 ± .0184 |                 | +.0267 ± .0213
11  | 1.1382 | .1760 ± .0027 | .3524 ± .0187 |                 | +.0017 ± .0213
12  | 1.1378 | .1767 ± .0027 | .3831 ± .0182 |                 | +.0501 ± .0213

[The final row of the printed table, giving the coefficients of correlation of successive first differences, is not legible in this copy.]

$$S_\delta = S_1'\sqrt{2(1 - R_1')} = .1785.$$
and as the sessional change for the series is very small, we may make the approximation

$$\Sigma (D_1 - d_1)(D_{k+1} - d_{k+1}) = \Sigma (D_1 - d_1)^2 = \Sigma (D_{k+1} - d_{k+1})^2 \quad \text{for all values of } k,$$

and in view of the constancy of $S_k'$,

$$S_1' = S_{k+1}' \quad \text{for all values of } k.$$

Then on the assumption that there is no significant periodic variation in the
observations,

$$R_k' \to 0 \text{ as } k \text{ increases},$$

and from (xliii)

$$R_k \to \frac{\frac{1}{m}\Sigma (D_1 - d_1)^2}{S_1'^2 + \frac{1}{m}\Sigma (D_1 - d_1)^2} = +.354.$$



The correlations $R_k'$ become rapidly insignificant; the values tabulated are of
course subject to the errors of the method of approximation, but as in the case of
the 10 second Counting Experiment, these should not be large owing to the
constancy of $S_k'$*. The points $(k, R_k)$ and $(k, R_k')$ are plotted in Figure 20; the two
curves there drawn will be referred to in Section XI below.


* The difference between S$; and Sj; is one of 1°4°/, only, 
(c) Comparison of Experiments C and D.


It has been found that in both the Counting and the Estimating Experiments 
there is evidence of a secular change in personal equation, and that in both cases 
the tendency is for the estimates to depart further from the true value of 10 seconds 
in the later series; in the Counting Seconds there is a decrease, in the Esti- 
mating Seconds an increase in length of estimate. There is also very little evidence 
of regular sessional change in either experiment. 
[Diagram: Experiment D. 10 Second Estimating. Correlation-Interval Diagram.]

Beyond this the similarity ceases; it is only necessary to compare the values of
the chief constants (defined on p. 36),

                 S'        S          R_1'
Counting       .0343     .0388     +.5200 ± .0156
Estimating     .1410     .1785     +.1984 ± .0205

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The variations in judgment in the Estimating Experiment are very large com- 
pared with those in the other, and at the same time there is low correlation 
between successive judgments, so that the observations will be found to be scattered 
far more nearly in accordance with the Normal Error Law than in the three 
preceding experiments. In the case of the Counting Experiment, the skew distri- 
bution of the 1000 observations has already been referred to. But for one or two 
exceptions (as III and XIX) the individual series in the Counting conform more 
closely to a general type than in the Trisections or Bisections, and this results in 
the very smooth values of the constants $R_k$ and $R_k'$.


X. EXPERIMENT E. PLATE MEASUREMENTS WITH ZEISS COMPARATOR.


The values of $\rho_1$ only have been calculated; these, with $\sigma_1$ and a brief description
of the nature of the marking measured, are given in the Table XVII; $\sigma_1$ is in
millimetres. Series I to VIII involved settings of both slide and micrometer, IX of
micrometer only.


No great weight can be attached to the result of one series of 50 readings on a 
marking, but it is justifiable to draw certain conclusions from the results of the 
eight series. In the first place, there appears to be a significant correlation between 
the successive measures of the edge of a band (I and II), but in measuring the
centres, i.e. in bisecting a bright maximum with the cross wire, there is on the
whole no correlation. This perhaps might be expected; the edges of bands or 
maxima in photographic spectra are not quite sharply cut, so that some uncertainty 
must exist in the observer’s mind as to where the real edge should be taken to be ; 
his opinion on this point may vary throughout the course of the sitting, and con- 
sequently correlation will be found between the successive readings. On the other 


TABLE XVII.

Series    ρ_1              σ_1      Description of marking

I        +.384 ± .081     .0016    Sharp edge of bright band (Edge)
II       +.467 ± .075     .0015    Slightly vaguer edge than I (Edge)
III      +.117 ± .094     .0007    Clear and narrow maximum (Centre)
IV       +.090 ± .095     .0012      ,,    ,,    ,,    ,,
V        +.021 ± .095     .0016      ,,    ,,    ,,    ,,
VI       −.001 ± .095     .0019    Broad and obscure maximum (Centre)
VII      −.050 ± .095     .0022      ,,    ,,   soft   ,,
VIII     +.227 ± .091     .0041      …
IX       +.288 ± .087     .0004    Micrometer screw settings only



hand, in the bisection of a narrow maximum, there will be little doubt as to the 
position of the centre; the real estimate of the observer will vary but slightly, and 
the variations in the reading will be due mainly to failure in breaking off the push 
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or pull of the slide at the right moment. It is possible that unconscious “over-pulls” 
or “under-pulls” may go in runs together, but the measures seem to show that 
this is not the case, and that the correlation of successive judgments is due rather 
to correlated changes of mental estimate than to those of a more physical character. 
If it were more difficult to bisect a maximum, if there were greater opportunity for 
variation, it is probable that there would be a correlation of successive judgments, 
and this is perhaps illustrated by the case of Series VIII, which has the largest
standard deviation (.0041) and also a correlation ($\rho_1 = +.227 \pm .091$) possibly
significant.


The result of IX suggests that there is a correlation between successive settings
of the micrometer wires in the second eyepiece; this correlation would of course
enter into the results of I to VIII, but the standard deviation of IX (.0004) is so
small that the effect will be insignificant where the variations in slide settings are 
large. 

As a matter of practical application these results serve to emphasise the 
importance of the routine of measurement usually adopted; if, for example, it is 
proposed to take four readings of each of a number of markings on a plate, the four 
readings should not be made in succession, but all the markings should be measured 
once, and then perhaps a short interval taken before the second measuring is made, 
and so on. This method should eliminate the error in the mean of several measure- 
ments of a marking, which may arise from a correlation of successive judgments, as 
well as errors due to change in temperature of instrument or plate, etc.
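
The practical point can be illustrated by a small calculation. The sketch below is not part of the original reduction: it assumes, purely for illustration, that the correlation between readings $k$ places apart is $\rho_1^k$ (only $\rho_1$ was measured in this experiment), and the function name `variance_ratio_of_mean` is hypothetical. It compares the variance of the mean of four successive readings with that of four uncorrelated readings, in units of the variance of a single reading.

```python
def variance_ratio_of_mean(n, rho1):
    """Variance of the mean of n successive readings, in units of the
    single-reading variance, ASSUMING lag-k correlation rho1 ** k."""
    total = float(n)  # n diagonal terms, each equal to one variance
    for k in range(1, n):
        # there are (n - k) pairs at lag k on each side of the diagonal
        total += 2 * (n - k) * rho1 ** k
    return total / n ** 2


# rho_1 = +.384 is the Series I value from Table XVII.
successive = variance_ratio_of_mean(4, 0.384)
independent = variance_ratio_of_mean(4, 0.0)
print(round(successive, 3), round(independent, 3))
```

Under these assumptions four readings taken in succession reduce the variance far less than four independent readings would, which is the reason for spreading the repeat measures of a marking over the sitting.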


XI. ANALYSIS OF THE CORRELATION BETWEEN SUCCESSIVE JUDGMENTS. 


(a) The Theory of correlated Estimates and accidental Errors. 


It has been seen that in the case of the Bisection and Timing Experiments
when the secular term was removed the coefficients of correlation of the successive
judgments, or the constants $R_k'$, diminished to approximately zero values as $k$, the
interval between the judgments correlated, was increased. In the Trisection
Experiment, owing to the marked sessional change which was repeated in practically
all the series, $R_k'$ appeared to approach a value of +.16 and not zero as $k$ was
increased; the sessional change in this case appeared to be of parabolic rather than
linear form, and it seemed possible that if the ordinates of the "best" fitting
parabola of each series were removed from the observations, the coefficients of
correlation of the residuals, or the $R_k''$'s, would tend to zero as $k$ increased, as in the
case of the other three experiments in which there was no large sessional change.
The points representing the values of $R_k'$ which have been plotted in Figures 8,
12, 17 and 20 appear on the whole to lie so nearly on a smooth curve, that it is of
no little interest to inquire whether we can obtain equations to such curves based
on some definite theory of the physiological factors underlying the variations in an
observer's judgment.
In the first place we have seen that neither a secular change in personal
equation (the variation in series means) nor a simple sessional change such as that
represented by the straight line or by a second order parabola considered in the
Trisection Experiment, will account for the whole of the correlation of successive
judgments. We must therefore conclude that quite apart from the large scale
variations in judgment which are due to the more gradual changes of state in the
observer resulting, perhaps, from experience or fatigue, there is a definite relationship
between the small scale variations in judgment; if judgment $y_t$ is greater than the
average of the five or six preceding judgments, then we shall on the whole
expect that $y_{t+1}$, the next judgment, will also be greater. I propose therefore to
consider what results will follow from the assumption that $y_t$ has a correlation $r$
with $y_{t-1}$ and $y_{t+1}$, but that for $y_{t-1}$ or $y_{t+1}$ constant it has no partial correlation with
$y_{t-2}$ and $y_{t+2}$ or judgments at greater intervals. In other words we will suppose
that the observer's estimation at any moment is only influenced by the preceding
estimation, and only through this, and not directly, by the earlier estimations.


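Let us take the successive judgments $y_t, y_{t+1}, y_{t+2}, \ldots y_{t+k} \ldots$ and suppose that
the total correlation between $y_t$ and $y_{t+k}$ is $\rho_k$, where $k = 1, 2, 3, \ldots$, and $\rho_1 = r$. If
there is no partial correlation between $y_t$ and $y_{t+2}$, $y_{t+1}$ being constant we must
have

$$\rho_2 - \rho_1^2 = 0 \quad \text{or} \quad \rho_2 = r^2.$$

In the same way if there is no partial correlation between $y_t$ and $y_{t+3}$ when $y_{t+1}$
(or $y_{t+2}$) is constant,

$$\rho_3 - \rho_1 \rho_2 = 0 \quad \text{or} \quad \rho_3 = r^3,$$

and in general we find that

$$\rho_k = r^k \qquad \text{(xliv)}.$$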
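
The step from zero partial correlation to $\rho_k = r^k$ can be checked with the ordinary first order partial correlation formula. The following is a minimal sketch; the function name `partial_correlation` and the trial value $r = .7$ are illustrative assumptions, not quantities from the experiments.

```python
import math


def partial_correlation(r_xz, r_xy, r_yz):
    """First-order partial correlation of x and z with y held constant."""
    return (r_xz - r_xy * r_yz) / math.sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))


r = 0.7  # illustrative value only
rho = {k: r ** k for k in (1, 2, 3)}  # total correlations rho_k = r^k

# y_t and y_{t+2} with y_{t+1} constant: numerator is rho_2 - rho_1^2.
lag2 = partial_correlation(rho[2], rho[1], rho[1])
# y_t and y_{t+3} with y_{t+1} constant: numerator is rho_3 - rho_1 * rho_2.
lag3 = partial_correlation(rho[3], rho[1], rho[2])
print(lag2, lag3)
```

Both partial correlations vanish (to rounding error) when the total correlations fall off in geometrical progression, and any other value of $\rho_2$ or $\rho_3$ leaves a non-zero numerator.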


In reaching this simple result there is a point however that has been overlooked ; 
it has been assumed that there is some physiological or psychological significance 
in the correlation of an estimate of a quantity and in the preceding estimate, but 
it must be remembered that the value which the observer records may not be 
exactly that which he wished to record, or in other words he may be unable to 
record his true estimate. Thus in bisecting a line it is likely that the pencil point 
will not strike the paper exactly at the spot intended, or in counting 10 seconds 
the tapping of the key may not be exactly synchronised with the beginning or end 
of the count, and there may be many other little external influences of which the 
observer is unaware, which will all combine to form what may be termed an acci- 
dental error superimposed upon the true correlated estimation. Let us examine 
how the relation (xliv) will be modified by introducing the idea of these accidental 
and uncorrelated errors; we must suppose that the observer's recorded judgment
$y_t$ is made up of two parts, $\alpha_t$ his actual estimate at the moment of record and $\beta_t$,
some complex of accidental errors affecting his record. Then

$$y_t = \alpha_t + \beta_t \qquad \text{(xlv)}.$$
Now if we assume that the accidental errors $\beta_t$ are as like to be positive as
negative, and that they will not be correlated in any manner among themselves
nor with the fundamental part of the judgment $\alpha_t$, we shall have the following
approximate relations

$$\left.\begin{aligned}
\sum_{t=1}^{N} \beta_{t+k} &= 0, \\
\sum_{t=1}^{N} \beta_{t+k}\,\beta_{t+k'} &= 0 \quad (k \neq k'), \\
\sum_{t=1}^{N} \beta_{t+k}\,\alpha_{t+k'} &= 0,
\end{aligned}\right\} \qquad \text{(xlvi)}$$

where $N$ is large compared with $k$, and where $k$ and $k'$ take any of the
values 1, 2, 3, ... etc.

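But the correlation between successive values of the $y$'s at intervals of $k$ is

$$\rho_k = \frac{\sum_{t=1}^{N} (\alpha_t+\beta_t)(\alpha_{t+k}+\beta_{t+k}) - N\left(\sum_{t=1}^{N} \frac{\alpha_t+\beta_t}{N}\right)\left(\sum_{t=1}^{N} \frac{\alpha_{t+k}+\beta_{t+k}}{N}\right)}{\sqrt{\left\{\sum_{t=1}^{N} (\alpha_t+\beta_t)^2 - N\left(\sum_{t=1}^{N}\frac{\alpha_t+\beta_t}{N}\right)^2\right\}\left\{\sum_{t=1}^{N} (\alpha_{t+k}+\beta_{t+k})^2 - N\left(\sum_{t=1}^{N}\frac{\alpha_{t+k}+\beta_{t+k}}{N}\right)^2\right\}}}$$

$$= \frac{\sum \alpha_t\alpha_{t+k} - N\left(\sum \frac{\alpha_t}{N}\right)\left(\sum \frac{\alpha_{t+k}}{N}\right)}{\sqrt{\left\{\sum \alpha_t^2 + \sum \beta_t^2 - N\left(\sum\frac{\alpha_t}{N}\right)^2\right\}\left\{\sum \alpha_{t+k}^2 + \sum \beta_{t+k}^2 - N\left(\sum\frac{\alpha_{t+k}}{N}\right)^2\right\}}}$$

in view of the relations (xlvi)

$$= \frac{[\alpha_t \alpha_{t+k}]}{\sqrt{(\alpha_t^2 + \beta_t^2)(\alpha_{t+k}^2 + \beta_{t+k}^2)}},$$

where $[\alpha_t \alpha_{t+k}]$ is the first order product moment coefficient referred to mean of the
successive $\alpha$'s at intervals of $k$,

and $\sqrt{\alpha_t^2}$ is the standard deviation of $\alpha_t, \alpha_{t+1}, \ldots \alpha_{t+N}$,

$\sqrt{\beta_t^2}$ is the standard deviation of $\beta_t, \beta_{t+1}, \ldots \beta_{t+N}$,

and $\sqrt{\alpha_{t+k}^2}$, $\sqrt{\beta_{t+k}^2}$ are the standard deviations of $\alpha_{t+k}, \alpha_{t+k+1}, \ldots \alpha_{t+k+N}$ and of
$\beta_{t+k}, \beta_{t+k+1}, \ldots \beta_{t+k+N}$.

Now unless there is a steady sessional change in the $\alpha$'s, we may assume that
for large values of $N$

$$\alpha_t^2 = \alpha_{t+k}^2 = \ldots = \alpha^2, \text{ say},$$

and similarly unless the accidental errors are steadily increasing or decreasing in
magnitude

$$\beta_t^2 = \beta_{t+k}^2 = \ldots = \beta^2,$$

and we have

$$\rho_k = \frac{[\alpha_t \alpha_{t+k}]}{\alpha^2 + \beta^2} = \frac{\alpha^2}{\alpha^2+\beta^2} \cdot \frac{[\alpha_t \alpha_{t+k}]}{\alpha^2} = \frac{\alpha^2}{\alpha^2+\beta^2}\, r_{\alpha_t \alpha_{t+k}}.$$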

But on the assumption made above of zero partial correlation between two
estimates which are not consecutive, we have found that $r_{\alpha_t \alpha_{t+k}}$, the correlation
between the observer's real estimates at intervals of $k$, can be expressed in the
form $r^k$, and therefore

$$\rho_k = \frac{\alpha^2}{\alpha^2+\beta^2}\, r^k = q r^k \qquad \text{(xlvii)},$$

where $q$ is a constant not depending on the interval $k$. With this expression for
the correlation we shall of course find an apparent partial correlation between the
judgments at intervals greater than one; for example the partial correlation
between $y_t$ and $y_{t+2}$, $y_{t+1}$ being constant, is

$$\frac{q r^2 (1 - q)}{1 - q^2 r^2},$$

and does not vanish unless $q = 1$. According to the theory suggested this is however
a spurious correlation due solely to the presence of the accidental errors.
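
The effect of the accidental errors can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: the true estimates are generated as a Gaussian chain in which each value depends on its predecessor alone, the accidental errors are independent Gaussian deviates, and the names `simulate_lag_correlations` and `spurious_partial` together with all numerical values are hypothetical. Nothing here reproduces the observer's data.

```python
import random


def simulate_lag_correlations(r, alpha_sd, beta_sd, n, max_lag, seed=1):
    """Sample correlations of y_t = alpha_t + beta_t at lags 1..max_lag.

    alpha_t is a stationary chain with lag-one correlation r (ASSUMED
    Gaussian); beta_t are independent accidental errors."""
    rng = random.Random(seed)
    innovation_sd = alpha_sd * (1 - r * r) ** 0.5  # keeps var(alpha) constant
    alpha = rng.gauss(0.0, alpha_sd)
    y = []
    for _ in range(n):
        y.append(alpha + rng.gauss(0.0, beta_sd))
        alpha = r * alpha + rng.gauss(0.0, innovation_sd)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    lags = []
    for k in range(1, max_lag + 1):
        cov = sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k)) / (n - k)
        lags.append(cov / var)
    return lags


def spurious_partial(q, r):
    """Partial correlation of y_t and y_{t+2}, y_{t+1} constant, when rho_k = q r^k."""
    return q * r * r * (1 - q) / (1 - (q * r) ** 2)


r, alpha_sd, beta_sd = 0.7, 2.0, 1.0
q = alpha_sd ** 2 / (alpha_sd ** 2 + beta_sd ** 2)  # = 0.8
observed = simulate_lag_correlations(r, alpha_sd, beta_sd, 100000, 4)
expected = [q * r ** k for k in range(1, 5)]
print([round(v, 3) for v in observed], [round(v, 3) for v in expected])
```

With a long simulated series the sample correlations lie close to $q r^k$, and `spurious_partial(q, r)` is positive for $q < 1$ although the underlying estimates have no partial correlation beyond one step.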

The next problem is to inquire how far a relation of the type of (xlvii)
will fit the correlation coefficients which have been calculated for the Experiments
A, B, C and D. In the first place, in order to get as smooth values for the
coefficients as possible we must combine the 20 series, which we may do if we
remove the secular change as represented by the variation in the series means;
this step is clearly necessary for we are considering the relationship between
judgments made in close proximity and are not concerned for the moment with
the variation in personal equation from day to day. We must therefore deal with
the coefficients of correlation $R_k'$ and endeavour to fit a curve $z = q r^x$ through the
points $x = k$, $z = R_k'$. I will consider the different experiments in turn.


(b) Application of Theory to results of Experiments.

Experiment A.

The curve represented by $z = q r^x$ is asymptotic to the $x$ axis (as $r < 1$), so that
if it is to fit the points $(k, R_k')$ it is necessary that $R_k'$ should tend to zero as $k$
increases. But the values of $R_k'$ given in Table V, p. 58, appear to tend as $k$
increases, to a limiting value between +.16 and +.15 rather than to zero.
I think that this results from the marked sessional changes which have been
represented in mean form by a second order parabola (see Equation (xxx) and
Figure 4), and that if there is a physiological significance in the distinction
between the sessional change and the residual variations of the observations when
freed from this change, it will be of interest to find out how the coefficients of
correlation of these successive residuals, what have been termed the $R_k''$'s, fall
off as the interval or $k$ is increased. Should it be found that the $R_k''$'s follow
the law

$$R_k'' = q r^k,$$

the argument in favour of distinguishing the sessional change from the residual
variations will be strengthened.


It was found that the values of $R_k'$ given in Table V could be fitted closely by
a curve of the form

$$R_k' = p + q r^k \qquad \text{(xlviii)},$$

where $p$, $q$ and $r$ are constants.


A rough trial gave the following approximate values: 
Dy = LOW, “Qp =. 09)! Tot 3 
Now if $z = f(p, q, r)$

$$= f(p_0, q_0, r_0) + \delta p \frac{\partial f}{\partial p_0} + \delta q \frac{\partial f}{\partial q_0} + \delta r \frac{\partial f}{\partial r_0} \text{ to first order},$$

$$= p_0 + q_0 r_0^k + \delta p + r_0^k\, \delta q + k q_0 r_0^{k-1}\, \delta r,$$

we have as equations of condition for a least square solution

$$\delta p + r_0^k\, \delta q + k q_0 r_0^{k-1}\, \delta r = R_k' - p_0 - q_0 r_0^k, \quad \text{for } k = 1, 2, \ldots 13.$$

Using the values of $p_0$, $q_0$ and $r_0$ given above, the corrections $\delta p$, $\delta q$ and $\delta r$
were calculated and gave finally as the best fitting numerical equation,

$$R_k' = .1524 + .6817\, (.7105)^k \qquad \text{(xlix)}$$
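
The linearised least square procedure just described can be repeated until the corrections vanish. The sketch below is one way of doing so and makes no claim to reproduce the original arithmetic: the data are the directly calculated $R_k'$ of Table XVIII as read from this copy (the entry for $k = 6$ is taken as .232, inferred from the difference column), the starting values are arbitrary rough guesses, and the helper names `gauss_newton`, `solve3` and `sse` are hypothetical.

```python
# Directly calculated R_k' for k = 1..13 (Table XVIII, second column).
DATA = list(enumerate(
    [0.625, 0.523, 0.388, 0.315, 0.281, 0.232, 0.222,
     0.191, 0.165, 0.183, 0.168, 0.172, 0.160], start=1))


def model(p, q, r, k):
    return p + q * r ** k


def sse(params, data):
    """Sum of squared residuals for the curve p + q r^k."""
    p, q, r = params
    return sum((y - model(p, q, r, k)) ** 2 for k, y in data)


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, 3):
            f = m[i][col] / m[col][col]
            for j in range(col, 4):
                m[i][j] -= f * m[col][j]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / m[i][i]
    return x


def gauss_newton(data, start, steps=50):
    """Repeat the first-order correction (dp, dq, dr) with step halving."""
    params = list(start)
    for _ in range(steps):
        p, q, r = params
        jtj = [[0.0] * 3 for _ in range(3)]
        jte = [0.0] * 3
        for k, y in data:
            row = (1.0, r ** k, k * q * r ** (k - 1))  # df/dp, df/dq, df/dr
            resid = y - model(p, q, r, k)
            for i in range(3):
                jte[i] += row[i] * resid
                for j in range(3):
                    jtj[i][j] += row[i] * row[j]
        delta = solve3(jtj, jte)
        current, step = sse(params, data), 1.0
        while step > 1e-6:  # halve the step until the fit does not worsen
            trial = [v + step * d for v, d in zip(params, delta)]
            if sse(trial, data) <= current:
                params = trial
                break
            step /= 2
    return tuple(params)


fit = gauss_newton(DATA, (0.16, 0.69, 0.70))
print([round(v, 4) for v in fit], round(sse(fit, DATA), 6))
```

The converged constants may differ slightly from those of (xlix), since the paper applied a single correction to its trial values and the data above are a transcription.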
TABLE XVIII.
Values of the R_k's for Trisection Experiments.

  1        2              3            4            5           6          7           8
                                                             Values obtained from
                                                             (lii) on assumption of
                                                             constancy of G_k
  k      R_k'           R_k'       Difference    Probable     S_k''      R_k''       R_k''
       (direct       (from eq.      Col. 2 −     Error of                          (from eq.
      calculation)    (xlix))        Col. 3        R_k'                             (lvi))

  0       —            +.834          —            —            —          —        +.804
  1     +.625           .637        −.012        ±.013        .0778      +.550       .571
  2      .523           .497        +.026        ±.016        .0776       .431       .406
  3      .388           .397        −.009        ±.018        .0778       .268       .288
  4      .315           .326        −.011        ±.019        .0781       .183       .205
  5      .281           .276        +.005        ±.020        .0778       .142       .146
  6      .232           .240        −.008        ±.020        .0782       .084       .103
  7      .222           .215        +.007        ±.020        .0782       .071       .074
  8      .191           .197        −.006        ±.021        .0783       .035       .052
  9      .165           .184        −.019        ±.021        .0787       .006       .037
 10      .183           .175        +.008        ±.021        .0802       .031       .026
 11      .168           .168         .000        ±.021        .0823       .017       .019
 12      .172           .164        +.008        ±.021        .0834       .023       .013
 13     +.160          +.160         .000        ±.021        .0840      +.009      +.009
 14       —              —            —            —          .0840        —          —


In the second column of Table XVIII are given the values of $R_k'$ taken from
Table V and in the fifth column their probable errors; the values of $R_k'$ given by
equation (xlix) are in the third column, and in the fourth are the differences
col. 2 − col. 3. It will be seen that the fit is a good one, the difference being only
greater than the probable error in the case of $R_2'$. The points $(k, R_k')$ and the
curve of (xlix) are shown in Figure 8 (p. 64).


The problem before us is therefore this; can we explain the constant $p$ in
equation (xlviii) in terms of the sessional changes? We have seen that the mean
sessional change for the 20 series can be represented by a parabola of the second
order, but we must allow for a different change in each series. Let us suppose that

$$y = f_p(t)$$

will represent the sessional change in the $p$th Series after the secular term
represented by the series mean has been removed, so that instead of equation
(xlv) of p. 89, we have

$$y_t' = f_p(t) + \alpha_t + \beta_t = f_p(t) + Y_t,$$

where $Y_t = \alpha_t + \beta_t$.
Then if $\sum_m$ indicates summation for the $m$ (or 20) series, $n = 50$, the number
of observations in each group of a series, and $k$ takes any of the group numbers
1, 2, ... 14, since $y = f_p(t)$ will be the "best" fitting curve of its type

$$\sum_{t=1}^{n} Y_{t+k-1} = 0$$

approximately, and on combining the $m$ series

$$\sum_m \sum_{t=1}^{n} (Y_{t+k-1}) = 0.$$

Again we have no reason to suppose that there will be any correlation between
the sessional term $f_p(t)$ and the residual $Y_t$, so that

$$\sum_m \sum_{t=1}^{n} f_p(t+k-1)\, Y_{t+k'-1} = 0$$

for all values of $k$ and $k'$ between 1 and 14.


As $y_t'$ is freed from the secular term, using the relations above we have that

$$R_k' = \frac{\sum_m \sum_{t=1}^{n} \{f_p(t) + Y_t\}\{f_p(t+k) + Y_{t+k}\} - mn \left(\sum_m \sum_{t=1}^{n} \frac{f_p(t)}{mn}\right)\left(\sum_m \sum_{t=1}^{n} \frac{f_p(t+k)}{mn}\right)}{\sqrt{\left[\sum_m \sum_{t=1}^{n} \{f_p(t) + Y_t\}^2 - mn\left(\sum_m \sum_{t=1}^{n} \frac{f_p(t)}{mn}\right)^2\right]\left[\sum_m \sum_{t=1}^{n} \{f_p(t+k) + Y_{t+k}\}^2 - mn\left(\sum_m \sum_{t=1}^{n} \frac{f_p(t+k)}{mn}\right)^2\right]}}$$

$$= \frac{R_k'' S_1'' S_{k+1}'' + F_k}{\sqrt{(S_1''^2 + G_1^2)(S_{k+1}''^2 + G_{k+1}^2)}} \qquad \text{(li)},$$

where $R_k''$ is the coefficient of correlation between $Y_t$ and $Y_{t+k}$, $S_1''$ and $S_{k+1}''$ are
the standard deviations of the $Y$'s of Groups 1 and $k+1$ (see (xi) and (xii) on
page 35), and

$$F_k = \frac{1}{mn} \sum_m \sum_{t=1}^{n} f_p(t)\, f_p(t+k) - \left(\sum_m \sum_{t=1}^{n} \frac{f_p(t)}{mn}\right)\left(\sum_m \sum_{t=1}^{n} \frac{f_p(t+k)}{mn}\right),$$

$$G_k^2 = \frac{1}{mn} \sum_m \sum_{t=1}^{n} \{f_p(t+k-1)\}^2 - \left(\sum_m \sum_{t=1}^{n} \frac{f_p(t+k-1)}{mn}\right)^2.$$

It will be seen that $G_k$ is the standard deviation of the ordinates of the curves
representing the sessional changes, $y = f_p(t)$, which correspond to the observations
in the $k$th groups, while $F_k / (G_1 G_{k+1})$ is the correlation of these successive ordinates at
intervals of $k$. If the sessional changes were linear this correlation would be unity,
and a little consideration will show that if the sessional change in each series can
be represented by a curve of gradual bend, the correlation will not be far from
this value. For example in the case of the parabola (Equation (xxx), p. 47)
which was fitted to the mean sessional change and is drawn in Figure 4, it is
found that

$$\frac{F_{13}}{G_1 G_{14}} = +.994.$$

We shall therefore make no great error in assuming that

$$F_k = G_1 G_{k+1},$$





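and it follows that the relation for $R_k'$ can be expressed in the form

$$R_k' = \frac{1}{\sqrt{\left(1 + \frac{S_1''^2}{G_1^2}\right)\left(1 + \frac{S_{k+1}''^2}{G_{k+1}^2}\right)}} + \frac{R_k''}{\sqrt{\left(1 + \frac{G_1^2}{S_1''^2}\right)\left(1 + \frac{G_{k+1}^2}{S_{k+1}''^2}\right)}} \qquad \text{(lii)}$$

$$= p_k + \lambda_k R_k'',$$

which must be compared with the relation

$$R_k' = p + q r^k \qquad \text{(xlviii) bis},$$

where $p = .1524$, $q = .6817$, $r = .7105$,
that has been found empirically to fit the actual values of $R_k'$.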

If the expressions $p_k$ and $\lambda_k$ were constant for $k = 1, 2 \ldots 14$ an interpretation
of (lii) would be at once suggested. Namely that $R_k''$, the coefficient of correlation
of the successive residuals $Y_t$ and $Y_{t+k}$ left after the removal of the secular and
sessional changes is expressible in the form

$$R_k'' = Q r^k \qquad \text{(liii)},$$

that is to say, making allowance for the presence of accidental errors, the law of
relationship between the successive estimates suggested on p. 90 above, holds
good. Now without finding the curve which represents the sessional change in
each series we do not know the values of $S_k''$ and $G_k$. We have however that

$$S_k''^2 + G_k^2 = S_k'^2 \qquad \text{(liv)},$$

where $S_k'$ is the standard deviation of the observations in the $k$th groups after the
removal of the secular term. The values of $S_k'$ are given in Table V, p. 58;
they are seen to increase as $k$ increases and therefore $p_k$ and $\lambda_k$ can only be
constant for all values of $k$ if

$$\frac{S_1''^2}{G_1^2} = \frac{S_2''^2}{G_2^2} = \ldots = \frac{S_{14}''^2}{G_{14}^2} \qquad \text{(lv)}.$$


That the relations (lv) should hold approximately is not at all improbable; for
with a sessional change of the parabolic form of the curve (xxx) illustrated in
Figure 4, the standard deviations of the ordinates in the later groups will increase
owing to the increasing drop of the curve towards the end of the series while $S_k''$
may increase with $k$ owing to greater variation towards the end of a session.

In fact for this particular mean series with its sessional curve represented by
(xxx) it is found that
$G_1 = .0336$ ins., $G_{14} = .0406$ ins.,
while $S_1'' = .0165$ ins., $S_{14}'' = .0201$ ins.,
that is to say, the variations superimposed upon the main sessional change (the
distances of the points plotted in Figure 4 from the parabola) become greater
towards the end of the series when the observer's judgment perhaps became more
erratic as he grew tired. These values give $S_1''/G_1 = .49$, $S_{14}''/G_{14} = .50$ suggesting that
the relations (lv) do hold very closely. What we find therefore in this typical mean
series represented by Figure 4 may well be expected to hold approximately in the
individual series.
If then $p_k$ is constant for $k = 1, 2, \ldots 14$ and equals $p$, we find readily from
equations (lii) and (lv) that

$$\lambda_k = 1 - p_k = 1 - p,$$

and hence

$$(1 - p) R_k'' = q r^k \quad \text{or} \quad R_k'' = \frac{q}{1 - p}\, r^k.$$

Making use of the numerical values $p = .1524$, $q = .6817$, $r = .7105$ we obtain
finally

$$R_k'' = .8043 \times (.7105)^k \qquad \text{(lvi)},$$

as the theoretical expression for the correlation of the successive residuals after
the observations have been freed from secular and sessional change. This curve is
the lower of the two curves drawn in Figure 8. The points which are there
plotted about this curve are the points $(k, R_k'')$* obtained from equation (lii)

(a) On the assumption that $G_1^2 = G_2^2 = \ldots = G_{14}^2$ = constant,

(b) $\dfrac{1}{\sqrt{\left(1 + \frac{S_1''^2}{G_1^2}\right)\left(1 + \frac{S_2''^2}{G_2^2}\right)}} = p_1 = .1524$,

(c) Making use of equations (liv) and the tabled values of $S_k'$.

The close fit of the curve to these points shows that the manner in which the
values of $R_k''$ fall off as $k$ increases is not much affected by the different
assumptions regarding the relations of the $S_k''$'s and the $G_k$'s made in the two cases †.
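
The theoretical values of equation (lvi) follow directly from the three fitted constants. The short sketch below simply evaluates $q r^k / (1 - p)$ for $k = 0$ to 13 so that the figures can be set beside the last column of Table XVIII; the function name `residual_correlation` is a hypothetical label, and the constants are those quoted in the text.

```python
P, Q, R = 0.1524, 0.6817, 0.7105  # constants of equation (xlix)


def residual_correlation(k, p=P, q=Q, r=R):
    """R_k'' = q r^k / (1 - p), i.e. equation (lvi)."""
    return q / (1.0 - p) * r ** k


table = [(k, round(residual_correlation(k), 3)) for k in range(0, 14)]
for k, value in table:
    print(k, value)
```

The leading coefficient $q/(1-p)$ comes out at about .804, which is the value entered against $k = 0$ in the table.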


Experiment B. 

Reference has been made on p. 71 to evidence for a slight periodicity in the
observations of this Experiment, which gives rise to small but apparently
significant negative values to $R_k'$, for $k > 7$. Further investigation might enable a
correction for this periodicity to be made, but at present it is not possible to
express $R_k'$ with exactness in the form

$$R_k' = q r^k.$$

For the purpose of comparison with the other experiments we can however
obtain values of $q$ and $r$ which will give a rough fit for the first few values of $R_k'$.
Thus if we take
$q = .47$, $r = .72$‡
we get the values
$R_1' = .34$, $R_2' = .24$, $R_3' = .18$, $R_4' = .13$,


which agree roughly with the actual values given in Table XI, namely 
Ry— soz, R,—23l, R; = 183, BR, ="085. 


* In Figure 8 these points have been indicated by $R_k'''$ to distinguish them from the correlation
coefficients of residuals after removal of linear sessional change, there denoted by $R_k''$.
† The values of $R_k''$ calculated from equation (lvi) and of $R_k''$ and $S_k''$ calculated on the assumption
of the constancy of $G_k$ are given in the 8th, 7th and 6th columns of Table XVIII.
R.! Rs 


t+ r='72 is the value of the mean of the ratios RY? R, 
1 2 


and fu , and using this value for r, 
3 
q was taken as ‘47 by rough trial. 
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Experiment C. 

At the end of the section dealing with the reduction of the observations 
for this experiment the conclusion reached was that R,.’ and R,; were not 
significantly negative; no difficulty therefore arises in fitting a curve of the form 
$y = q r^x$ to the values of $R_k'$ given in the 6th row of Table XIV, p. 80. This
was effected by the method of least squares, with the result

$$R_k' = .6673 \times (.79\ldots)^k \qquad \text{(lvii)}.$$

In the 7th row of Table XIV are given the values of $R_k'$ calculated from this
equation, and in the 8th row the differences
($R_k'$ from observations) − ($R_k'$ from curve).

If these differences are compared with the probable errors of $R_k'$, it will be
seen that the fit is very satisfactory, for the later calculated values of $R_k'$ are in
any case uncertain; R,.’ and R,,’ were indeed not used in the least square 
solution as they were known to have too high negative values. 

Experiment D.

On p. 85 it was suggested that $R_k$ would approach the value +.354 as $k$
increased. In this case a curve of the form

$$R_k = .354 + q r^k,$$

was fitted to the calculated values of $R_k$. The fitting was carried out by moments.
Making $R_k - .354 = z$, we have

$$\sum_{1}^{s} (z) = q r\, \frac{1 - r^s}{1 - r} = N, \text{ say, where } s \text{ is the number of ordinates, or 12,}$$

$$\sum_{1}^{s} (z k) = q\,(r + 2r^2 + 3r^3 + \ldots + s r^s) = N \times \mu_1',$$

whence

$$\mu_1' = \frac{1}{1 - r} - \frac{s r^s}{1 - r^s} \qquad \text{(lviii)},$$

and is the distance of the mean from an origin at unit distance from the first
ordinate $q r$,

The constants $\mu_1'$ and $N$ are known; solving (lviii) by approximation we have $r$,
and then (lix) gives $q$.

The values are $q = .1153$, $r = .8121$,
and finally,

$$R_k = .354 + .1153\, (.8121)^k \qquad \text{(lx)}.$$
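
The fitting by moments can be sketched as follows. This is an illustration only: the area $N$ and mean $\mu_1'$ of the actual ordinates are not quoted in the text, so the example generates them from the fitted constants themselves and then recovers $q$ and $r$; bisection stands in for the unspecified "approximation", and the names `geometric_moments` and `fit_by_moments` are hypothetical.

```python
def geometric_moments(q, r, s):
    """Return (N, mu) for the ordinates z_k = q r^k, k = 1..s:
    N is their sum and mu the mean abscissa sum(k z_k) / N."""
    z = [q * r ** k for k in range(1, s + 1)]
    total = sum(z)
    mean_index = sum(k * zk for k, zk in enumerate(z, start=1)) / total
    return total, mean_index


def fit_by_moments(total, mean_index, s, tol=1e-12):
    """Recover (q, r) from N and mu. The mean abscissa does not depend on q
    and increases with r, so r is found by bisection; q then follows from N."""
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if geometric_moments(1.0, mid, s)[1] < mean_index:
            lo = mid
        else:
            hi = mid
    r = (lo + hi) / 2
    q = total / geometric_moments(1.0, r, s)[0]
    return q, r


# Round trip with the constants of equation (lx) and s = 12 ordinates.
N, mu = geometric_moments(0.1153, 0.8121, 12)
q_fit, r_fit = fit_by_moments(N, mu, 12)
print(round(q_fit, 4), round(r_fit, 4))
```

In practice $N$ and $\mu_1'$ would be computed from the twelve values of $R_k - .354$, after which the same two steps give $r$ and $q$.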


Then using the approximate relation

$$R_k' = (R_k - .354) \times \frac{S_1'^2 + \frac{1}{m}\Sigma (D_1 - d_1)^2}{S_1'^2},$$

which is a modified form of Equation (xliii), we obtain for $R_k'$ the equation

$$R_k' = .1785\, (.8121)^k \qquad \text{(lxi)}.$$

Both of the curves, represented by equation (lx) and (lxi), have been drawn in
Figure 20, and show a satisfactory fit, if the roughness of the data is taken into
account.


The results of the Trisection and of the Ten-second Counting Experiments, and
as far as the rough form of the data will allow, of the Ten-second Estimating
Experiment, suggest therefore that there is some foundation for the theory of
relationship between successive estimates put forward at the beginning of the
present Section. To reach the expression $q r^k$ for the correlation of successive
judgments at intervals of $k$, it has been necessary in all cases to remove the
secular change, and in one case a sessional change as well, but if these changes
correspond in themselves to some definite mental or physical processes which can
be separated in some degree from the causes underlying the residual variations,
then we are justified in inquiring into the significance of the constants $q$ and $r$.
It has been suggested that

$$q = \frac{\alpha^2}{\alpha^2 + \beta^2} \qquad \text{(lxii)},$$

so that $q$ is dependent on the ratio between the correlated and the uncorrelated
parts of the observer's judgment, that is between what I have considered as the
true estimate and the accidental errors superimposed in the process of record.

Now using (lxii) and the relation *

$$\alpha^2 + \beta^2 = S'^2 \qquad \text{(lxiii)},$$

(or $S''$ for the Trisections where it has been necessary to allow for a sessional
change), we find that

$$\sqrt{\alpha^2} = \sqrt{q}\, S', \qquad \sqrt{\beta^2} = \sqrt{1 - q}\, S' \qquad \text{(lxiv)},$$

and the values calculated in this way for $\sqrt{\alpha^2}$ and $\sqrt{\beta^2}$ are given in Table XIX.


TABLE XIX.

Experiment                          q      S'                         √α²     √β²     r

Trisection                         .80    .080 (= S'') in inches     .071    .036    .71
Bisection (approximate only)       .47    .045 in inches             .031    .033    .72
Ten-second Counting                .67    .034 in factor             .028    .020    .79
Ten-second Estimating              .18    .141 in factor             .060    .128    .81
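
The two standard deviations in Table XIX follow from $q$ and $S'$ by equations (lxiv). The sketch below repeats that arithmetic from the rounded tabled values of $q$ and $S'$, so small differences in the third decimal from the printed entries are to be expected; the function name `split_standard_deviation` and the dictionary layout are illustrative only.

```python
import math


def split_standard_deviation(q, s_total):
    """Equations (lxiv): s.d. of the true estimates and of the accidental errors."""
    return math.sqrt(q) * s_total, math.sqrt(1.0 - q) * s_total


# (q, S') as printed in Table XIX (S'' for the Trisections).
experiments = {
    "Trisection": (0.80, 0.080),
    "Bisection": (0.47, 0.045),
    "Ten-second Counting": (0.67, 0.034),
    "Ten-second Estimating": (0.18, 0.141),
}

results = {name: split_standard_deviation(q, s) for name, (q, s) in experiments.items()}
for name, (alpha_sd, beta_sd) in results.items():
    print(f"{name:24s} {alpha_sd:.4f} {beta_sd:.4f}")
```

By construction the two parts recombine as $\alpha^2 + \beta^2 = S'^2$, which is relation (lxiii).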


If the Trisection and Bisection results are compared it will be seen that the
standard deviations of the accidental errors ($\sqrt{\beta^2}$) are nearly the same but that
there is a large difference between the measures of the variations of the true


* It will be seen that owing to a sessional change in standard deviation, $S_k''$ for the Trisections
(Table XVIII) and $S_k'$ for the Bisections (Table XI) increase with $k$. To obtain an approximate value
for the standard deviation of the whole 1200 observations as opposed to that for the 1000 observations
of any particular Group $k$, I have used in equations (lxiii) and (lxiv) $S'$ (or $S''$) given by

$$S'^2 = \tfrac{1}{14}\left(S_1'^2 + S_2'^2 + S_3'^2 + \ldots + S_{14}'^2\right).$$
Biometrika x1v 7 
estimates ($\sqrt{\alpha^2}$). This is a result which we should anticipate, for the method of
recording the estimate was the same in each experiment, and accidental errors of
the same magnitude would occur in both cases; on the other hand the observer
was faced with a more difficult problem in estimating a third than in estimating
a half, and this is shown by the greater variability of his estimate in the former
case (.07 against .03)*. For the Timing experiments, we find no correspondence
between the $\sqrt{\beta^2}$'s; the great difference between the counting of ten seconds and
the attempted concentration of mind on the passing of an unbroken ten second
interval has been emphasized in the description of the experiments above, and a
correspondence was hardly to be expected. The standard deviations are in terms
of the factors e/p and must be multiplied by 10.2 if required in seconds.


If now we turn to the values of $r$ given in the last column of Table XIX, it
will be seen that they lie near together, and although that for the Bisections
is not an exact measure, there is a suggestion of close agreement between the $r$'s
in the pairs of similar experiments, for we have estimations of length with .71
and .72, and estimations of time with .79 and .81. This coefficient is a measure of
the rate at which the correlation of successive judgments falls off or the influence
of previous estimates vanishes from the observer's mind: on the theory of zero
partial correlation it is simply the coefficient of correlation between a true estimate
freed from accidental errors and the preceding estimate.


On any theory r would seem to be a fundamental constant not varying greatly
for different types of observations, but perhaps varying considerably for different
observers. The fact that it is so nearly the same for experiments with a five second
interval between observations (Trisection and Bisection) and for others with an
interval of ten seconds or more (Counting and Estimating) shows that the correlation
of successive judgments is a function not only of the time interval between
two judgments but also of the number of intervening judgments. For if it were
purely a function of the time interval we should expect to find a greater difference
between the values of r found for experiments with a five second interval and
a ten second interval. Indeed if the experiments were exactly the same but for
difference in interval, $R_1'$ for that with ten seconds would equal $R_2'$ for that with
five seconds. Further experiments of the same type in which the interval between
the recording of judgments was varied would undoubtedly throw much light on
this point.
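
The argument can be put in numbers. The sketch below assumes, purely for illustration, that correlation decays geometrically with elapsed time alone, so that one ten second step is equivalent to two five second steps; the values .71 and .79 are those quoted above for the length and timing experiments, and the variable names are hypothetical.

```python
# Hypothetical check: if correlation depended only on elapsed time,
# one ten-second step would count as two five-second steps.
r_five_seconds = 0.71          # length experiments, as quoted in the text
r_ten_seconds_observed = 0.79  # timing experiments, as quoted in the text

# Under purely time-dependent geometric decay the lags multiply.
r_ten_seconds_expected = r_five_seconds ** 2

print(f"expected r at ten seconds: {r_ten_seconds_expected:.2f}")
print(f"observed r at ten seconds: {r_ten_seconds_observed:.2f}")
```

On this assumption the ten second experiments would show a coefficient near .50, well below what was found, which is the sense in which the number of intervening judgments must also matter.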


XII. PREDICTION. 


If the values of “m” successive judgments are known and there is no correlation
between them, the “most probable” value of the (m + 1)th judgment, that is
the most reasonable guess at its value that can be made, is the mean of the “m”
judgments. If however the successive judgments are correlated, then it is possible
to predict the value of the (m + 1)th with much greater expectation of accuracy.


* This may be compared with the ratio of 3 to 2 given on p. 73 from a comparison of the 
S,’s before making any allowance for the accidental errors. 
In the Experiments B, C and D it has been found that the correlation between
judgments at intervals of k, made in the same session, can be expressed approximately
in the form

$$R_k' = q\,r^k \qquad \text{(lxv)},$$


while for Experiment A, owing to the large sessional change, the expression was

$$R_k' = p + q\,r^k \qquad \text{(lxvi)}.$$


The decrease of correlation in geometrical progression expressed by (lxv)
follows precisely the law of ancestral heredity, for which the multiple regression
equations required for prediction have already been worked out*. It is not
therefore proposed to go further into the problem in the present Paper, nor to
inquire whether the general multiple regression equations would reduce to as
simple a form when the correlation is expressed by equation (lxvi) rather
than (lxv).
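
As a rough illustration of what such regression equations involve, the following sketch computes least-squares weights for predicting the next judgment from the last m, assuming every judgment has the same mean and standard deviation and that the correlation at lag k is q r^k. The function names and the values of q and r are hypothetical; this is not the worked solution referred to in the text.

```python
def lag_correlation(k, q, r):
    """Correlation between judgments k steps apart: 1 at lag 0, q*r**k otherwise."""
    return 1.0 if k == 0 else q * r ** k


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def prediction_weights(m, q, r):
    """Regression weights on the last m judgments (oldest first) for the next one."""
    between_known = [[lag_correlation(abs(i - j), q, r) for j in range(m)]
                     for i in range(m)]
    with_next = [lag_correlation(m - i, q, r) for i in range(m)]
    return solve(between_known, with_next)


weights = prediction_weights(m=4, q=0.8, r=0.75)  # illustrative values only
print([round(w, 3) for w in weights])
```

The weights apply to deviations from the mean measured in units of the standard deviation. When q = 1 only the most recent judgment receives any weight; with q below 1 the earlier judgments contribute as well.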


XIII. SUMMARY AND CONCLUSIONS.


The secular change in personal equation is shown by the variation in the series’ 
means, but it is only in Experiment A and perhaps Experiment C, where the general 
trend of the variations is markedly in one direction, that we find that type of 
change which is usually understood when a secular change is referred to. In the 
Bisection Experiment B the linear secular change is very small and its existence 
might well not be recognized, and yet the series’ means are subject to fluctuations 
far exceeding those of random sampling. For the probable error of the mean of a 
series (or of the observations in Group 1) is 

$$\pm\,.67449 \times \frac{S}{\sqrt{50}} = \pm\,.00416,$$

but if we take the distribution consisting of the 20 series means, $d_s$, we find that
the standard deviation is .037375, giving for the probable error of a mean $d_s$

$$\pm\,.02521,$$


which is more than six times as large as the probable error we have calculated by 
considering the variations within a series. It is therefore clear that the 50 
observations in a series are not random samples of the whole “universe” of 
observations, as they should be on the Gaussian hypothesis of normal errors. 
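
The arithmetic behind this comparison can be retraced as follows. The sketch uses only the figures quoted above (.00416, .037375 and a series of 50 observations) together with the usual factor .67449; the variable names are illustrative.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449  # ratio of probable error to standard deviation

pe_within = 0.00416         # probable error of a series mean from within-series scatter
sd_series_means = 0.037375  # standard deviation of the 20 series means

# Standard deviation of a single observation implied by the within-series figure.
sd_single = pe_within * math.sqrt(50) / PROBABLE_ERROR_FACTOR

# Probable error of a series mean judged from the scatter of the means themselves.
pe_between = PROBABLE_ERROR_FACTOR * sd_series_means

print(f"implied S.D. of one observation:   {sd_single:.4f}")
print(f"probable error from series means:  {pe_between:.5f}")
print(f"ratio of the two probable errors:  {pe_between / pe_within:.2f}")
```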


It is again only in Experiment A that there is a fairly consistent sessional 
change from series to series which an observer might easily recognize and possibly 
allow for, and yet if we turn to any of the graphs for the Bisection or Seconds- 
counting which show the variations of judgment within a series (Figures 11 and 15), 
it will be seen how very often the mean of ten consecutive judgments will give but 
a poor approximation to the mean of the series; we cannot take the judgments 
within one series as scattered at random. When dealing with a sample of m 


* The Galton-Pearson Law of Ancestral Heredity; the offspring and the mean of the kth grandparents
have $q r^k$ for their correlation.
correlated variates, the usual expression for the probable error of the mean is


(1) + °67449 rm Fn, XS compared with (2) + °67449 = when the variates 
m m 


are not correlated, but owing to the sessional variations to which a large part of 
the correlation is due, the expression (1) being the smaller, is in the present case 
a worse measure than (2), of the probable limits of divergence of the mean of the 
sample from the mean of the series. The graphs of Figures 6, 11 and 15 show 
that there is a tendency for the judgments to vary in waves, to be first on one 
side of the mean for the series, and then to change to the other, but with no 
definite period of variation. It is owing to these large correlated variations which 
cannot be expressed in any simple sessional term, that the coefficients of correlation,
$r_{\sigma_s\rho_s}$, between $\sigma_s$ and $\rho_s$, have been found to have positive values ranging
from +.52 ± .11 in Experiment A to +.18 ± .15 in D, showing that greater
variation is associated with higher correlation of successive judgments.


An analysis has suggested that the coefficients of correlation of the crude values
of the observations at intervals of k can be expressed in the generalized form

$$R_k = \frac{S_1''\,S_{k+1}''\,R_k'' + F_k + \frac{1}{m}\sum (D_t - \bar d_1)(D_{t+k} - \bar d_{k+1})}{\sqrt{\left\{{S_1''}^{2} + G_1 + \frac{1}{m}\sum (D_t - \bar d_1)^2\right\}\left\{{S_{k+1}''}^{2} + G_{k+1} + \frac{1}{m}\sum (D_{t+k} - \bar d_{k+1})^2\right\}}} \qquad \text{(lxvii)},$$

where

$\frac{1}{m}\sum (D_t - \bar d_1)(D_{t+k} - \bar d_{k+1})$, $\frac{1}{m}\sum (D_t - \bar d_1)^2$, etc. are terms representing the secular change,

$F_k$ and $G_k$ are functions of the sessional change, and

$R_k''$ and $S_k''$ are the correlation coefficients and standard deviations of the residuals
left after secular and sessional changes have been removed.


In two experiments it has been found that $R_1$ is greater than +.80, which shows
clearly that the estimates have not been distributed randomly in time. 


The coefficients $R_k''$ appear to fall off in geometrical progression, and to be
closely represented by expressions of the form $q r^k$, in which q and r are constant
for any experiment; it has been found that the introduction of the quantities F
and G in equation (lxvii) in addition to the secular terms, is only necessary if there
is a significant sessional change which repeats itself in series after series. Thus in
Experiment C, where there was no such change, $R_k$ could be expressed by the
relation

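$$R_k = \frac{q\,r^k\,S_1''\,S_{k+1}'' + \frac{1}{m}\sum (D_t - \bar d_1)(D_{t+k} - \bar d_{k+1})}{\sqrt{\left\{{S_1''}^{2} + \frac{1}{m}\sum (D_t - \bar d_1)^2\right\}\left\{{S_{k+1}''}^{2} + \frac{1}{m}\sum (D_{t+k} - \bar d_{k+1})^2\right\}}} \qquad \text{(lxviii)}.$$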


A tentative interpretation has been given to the results of this analysis. The 
observations in Experiment A suggested that there was some physiological signi- 
ficance in the distinction between the secular and sessional changes, and this was 
confirmed in Experiment B, where it was found that there was evidence of a linear
sessional change acting in the opposite direction to the secular change. A discussion 
of the values of the partial correlation coefficients rz,.- (personal equation and order, 
time constant) and r,,,,, (personal equation and time, order constant) suggested that 
if the interval between the successive series were made very short, it might not be 
sufficient to break the effect of the sessional change. The correlated variations 
which have been found to follow the law $R_k' = q r^k$, have been considered as in some
way separate from and superimposed upon the other more steady changes. Starting 
from the tentative assumption that there is little or no partial correlation between 
the observer’s true estimates at intervals greater than one—that is to say that the 
observer's judgment at any moment is only influenced by the judgment immediately 
preceding, and only through this and not directly by the earlier judgments—it has 
been shown that the constant q in the relation 


$$R_k' = q\,r^k \qquad \text{(lxv) bis}$$


can be accounted for by the presence of uncorrelated accidental errors which are 
superimposed on the correlated variations in the observer's true estimate. Without 
further investigation it would be difficult to distinguish between what may perhaps
be termed the physiological and the psychological factors; in the experiments that
have been undertaken the variations in recorded judgment depend partly on the 
movements of the hand, so that the former factors are likely to have played some 
part as well as the latter. The successive recording motions of the hand may have 
been correlated as well as the variations in mental estimate. 
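
One way to see how uncorrelated accidental errors give rise to the constant q is sketched below. It assumes, for illustration only, that the true estimates have lag-k correlation r^k and variance `var_true`, and that each recorded judgment adds an independent error of variance `var_error`; both variances and the function name are hypothetical.

```python
def observed_lag_correlation(k, r, var_true, var_error):
    """Lag-k correlation of recorded judgments (true estimate plus accidental error)."""
    if k == 0:
        return 1.0
    covariance = var_true * r ** k   # uncorrelated errors add nothing to the covariance
    variance = var_true + var_error  # but they do inflate the variance
    return covariance / variance


r = 0.75                        # illustrative value
var_true, var_error = 4.0, 1.0  # hypothetical variances
q = var_true / (var_true + var_error)

for k in range(1, 5):
    direct = observed_lag_correlation(k, r, var_true, var_error)
    print(k, round(direct, 4), round(q * r ** k, 4))
```

Under these assumptions q is simply the share of the recorded variance that belongs to the true estimate, so it approaches unity as the accidental errors become small.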

The importance of the results of course depends on how far they may be con- 
sidered as typical of any practical series of observations made by the astronomer or 
the physicist. Experiments were admittedly chosen in which it was expected that 
the variations in judgment would be large, and for the experienced observer working 
at the type of observation in which he has had much practice, the errors would no 
doubt be smaller, but it seems to me likely that the phenomena which have been 
discussed will be present in the judgments of other observers even if on a smaller 
scale. Experience and accuracy may be gained by practice, but it does not follow
that the correlation between successive judgments will disappear. The secular and 
sessional changes may be small, but if rough comparisons of only the yearly mean 
personal equations of different observers are made, the finer changes, which may 
be of considerable importance in a combination of observations, cannot be recognized. 
The Law of Normal Errors requires but two constants to describe adequately any 
series of observations:

(1) the mean, 
(2) the standard-deviation, 


while the introduction of a third may be necessary if a gradual secular change in 
personal equation is noticed. But the more generalized Theory of Errors discussed 
in the preceding sections requires more detailed information and a greater number 
of constants to define the character of an observer’s personal equation and variations 
in judgment. We shall require to know how the personal equation and the standard 
deviation vary both within a session and over long periods of time, and if there is
any correlation between successive judgments, what is the form of the function $\psi$,
which gives the value of the successive correlation coefficients in the relation

$$R_k = \psi(k).$$

It is only by a detailed analysis of the observations themselves or of others 
carried out ad hoc, copying them as closely as possible, that full information on 
these points can be obtained; but if the possible complexities which may be present
in the variations of judgment are fully realised, a great deal may be done in prac- 
tical cases by the arrangement of the observations and the combination of the 
results, to eliminate the factors whose magnitude is unknown and to correct for 
others which are more easy to ascertain. 


I have heartily to thank Miss I. McLearn for making the diagrams for Figs. 3,
4, 8, 12, 17, 19 and 20, and Miss M. Noel Karn for assistance in some of the 
computation. 
Figs. I–IX. Pelorism of various intensities. Fig. X. Split corolla.


INHERITANCE IN THE FOXGLOVE, AND THE RESULT 
OF SELECTIVE BREEDING. 


By ERNEST WARREN, D.Sc. Lond. 


In Biometrika, Vol. XI. pp. 303–327, 1917, the author published a preliminary
report on the earlier results obtained in the breeding of foxgloves; and the present
paper contains some account of the final results of the selection experiments.


In 1914 ten foxglove plants (Digitalis gloxiniaeflora), obtained from various
sources and of different characteristics, were crossed among themselves and also 
self-fertilised. In subsequent years, 1915—19, new generations were obtained 
chiefly by the self-fertilisation of selected parents. The measurement, or when 
not possible the grading, of certain characters (pelorism, colour, size of flower, 
spotting of flower, etc.) was undertaken in all the generations in order to deter- 
mine the effect of selection when selfing alone occurred in an apparently pure 
race. 

1. PELORISM. 


Mendelian inheritance occurred in a typical fashion. A peloric plant crossed 
with a non-peloric plant produced non-peloric offspring. On selfing these, or 
crossing them together, there resulted on the average one peloric to three non- 
pelorics. 

Of the 10 parent plants two exhibited the peloric condition in a fully developed 
form, and the rest were non-peloric. The character was very perfectly recessive, 
and by breeding, it was found that three of the remaining plants were really 
heterozygous, while all the others were non-peloric and homozygous. 


It was soon observed that the peloric condition was by no means a clearly 
defined and fixed character. Pelorism in the foxglove may be regarded as an 
abnormal lack of power to produce internodes between the flower-buds, and con- 
sequently there may result considerable fusion of such buds with one another. 


The maximum stage of pelorism is seen when the main-axis is short and
abruptly ceases to grow in height. Only two or three normal flowers may be
produced by the axis, and its blunt, sharply truncated end is surrounded by a
whorl of bracts or sepals, petals being absent. Sometimes a ring of sessile anthers
occurs (Pl. I, figs. I, II).


In typical pelorism the inability to produce internodes affects the terminal 
portions of all of the flower-axes of a plant, both central and side-axes. A variable 
number of flower-buds fuse and the corollas unite and may form a large sym- 
metrical cup or saucer of some ornamental value, but the sepals mostly remain 
separate (figs. III, IV). When numerous flower-buds fuse a dense rosette may be
formed by the petals, and the result is not pleasing. The peloric or crown-flower
opens early, often before any of the normal flowers. After the crown-flower has
faded, the main-axis usually grows through the centre of it, and may even produce
a second crown-flower (fig. VI); but in the case of the side-shoots the axis generally
ends in an ovary and no further growth occurs (fig. V).

If the peloric tendency is not so well-marked, the main-axis may be only 
slightly affected by the suppression of several internodes, and by the partial 
fusion of flower-buds, at a variable distance above the lowest normal flower of the 
axis. Sometimes a considerable number of internodes may be unduly shortened, 
so as to produce excessive crowding of flowers which do not actually fuse (fig. VII),
and frequently a strongly marked spiral bending of the axis occurs (fig. VIII). 

At other times the suppression of the internodes may occur only high up on 
the flowering axis close to where it normally ceases to grow (fig. IX). 

When the central axis is strongly peloric the side-axes are invariably so, and in 
all other cases the side-axes exhibit greater pelorism than the main-axis. 

Finally, the main-axis may be quite normal and show no peloric tendency, but 
the side-axes may still be strongly peloric. 

The last trace of pelorism in a plant is shown when only one or two of the 
weaker side-axes exhibit some slight sign of a peloric tendency. 

It is unfortunate that it has not been found possible to devise any practical 
method of measuring the intensity of pelorism, and therefore the plants have been 
arranged in four grades. 

0° grade = no peloric tendency.

1°–25° grade = those in which the central axis is non-peloric, but the side-axes exhibit some peloric tendency.

26°–50° grade = main-axis non-peloric, but side-axes may reach full pelorism.

51°–75° grade = main-axis partially peloric, side-axes fully so.

76°–100° grade = plants ranging to complete pelorism in all axes.

In the generations produced from 1914—19 there were in all 128 fertilisa- 
tions of different classes of individuals, recessive (peloric), homozygous dominant 
(non-peloric) and heterozygous dominant (non-peloric) plants, and families were 
raised. In the table on p. 105 the experimental and theoretical results are 
compared. The fertilisations of the classes DD x DD, RR x RR, and DR x DR 
include both selfing and crossing. The sum totals of the experimental and 
theoretical results are remarkably close; being, crowned, 1019 experimental and 
1013 theoretical ; non-crowned, 1169 experimental and 1175 theoretical. 

It must be noted here that a plant was recorded as “ peloric” or “ crowned” if 
it exhibited the least tendency towards pelorism in any of the axes. Taking all 
the classes or groups together it may be said that the inheritance of the quality of 
pelorism is typically Mendelian. The group RR x RR should include no non-crowned
offspring, and the 7 which occurred were obtained by gradual selection.
The group in which the experimental result diverged the most widely from the
theoretical result was DR x RR (heterozygous plants crossed with recessives) and
it would be interesting to know whether such is generally the case in Mendelian
inheritance.


Gametic Nature | Number of | Number of | Number of Crowned Offspring | Number of Non-Crowned Offspring
of Pairings    | Families  | Offspring | Experimental | Theoretical  | Experimental | Theoretical
DD x DD        |     16    |    266    |       0      |       0      |      266     |     266
RR x RR        |     43    |    741    |     734      |     741      |        7     |       0
DR x DR        |     38    |    777    |     187      |     194      |      590     |     583
DR x DD        |      5    |     93    |       0      |       0      |       93     |      93
DR x RR        |     12    |    156    |      98      |      78      |       58     |      78
DD x RR        |     14    |    155    |       0      |       0      |      155     |     155
Totals         |    128    |   2188    |    1019      |    1013      |     1169     |    1175
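
The theoretical column follows directly from the Mendelian ratios. As a brief sketch, with the pairing labels and counts taken from the table and the dictionary names chosen only for illustration, the expected number of crowned (recessive) offspring in each group is the family total multiplied by the recessive fraction for that pairing:

```python
# Expected fraction of recessive (crowned) offspring for each gametic pairing.
recessive_fraction = {
    "DD x DD": 0.0, "RR x RR": 1.0, "DR x DR": 0.25,
    "DR x DD": 0.0, "DR x RR": 0.5, "DD x RR": 0.0,
}
# (number of offspring, crowned offspring observed), as given in the table.
observed = {
    "DD x DD": (266, 0), "RR x RR": (741, 734), "DR x DR": (777, 187),
    "DR x DD": (93, 0), "DR x RR": (156, 98), "DD x RR": (155, 0),
}

expected_total = 0
for pairing, (offspring, crowned) in observed.items():
    expected = round(offspring * recessive_fraction[pairing])
    expected_total += expected
    print(f"{pairing}: observed {crowned}, expected {expected}")

observed_total = sum(crowned for _, crowned in observed.values())
print(f"totals: observed {observed_total}, expected {expected_total}")
```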


The Inheritance of the Degree or Intensity of Pelorism. 

If a peloric plant be crossed with a non-peloric homozygous dominant, the 
offspring are heterozygous and non-peloric, and if these are self-fertilised or crossed 
together the peloric character re-appears in an apparently unchanged and un- 
diluted condition. If, on the other hand, a strongly peloric plant is crossed with 
a weakly peloric one the offspring are more or less intermediate, and if the 
offspring are selfed or fertilised together the intermediate nature of the peloric 
character tends to be retained. 

In the accompanying table A, B, C, D, E are plants of various gametic constitution.
On selfing (A) the offspring were all fully peloric. On selfing some
5 offspring, A, 2–9, the plants produced were all essentially fully crowned.

On crossing two recessive plants (A and E) of different peloric intensities (see
bottom of table) the offspring tended to be intermediate.

On crossing (A) with an ordinary plant (B) the offspring were non-peloric and 
heterozygous. On selfing two of these plants, (A x B) pls. 2 and 7, the offspring 
were either fully peloric, or non-peloric (heterozygous and homozygous). On 
selfing two recessives, (A x B) 2, pls. 8 and 9, obtained from (A x B) pl. 2, the 
offspring were all nearly completely peloric. Thus, there was no clearly marked 
dilution or apparent contamination by crossing a peloric plant with a non-peloric 
one. When, however, the same recessive plant (A) was crossed with a hetero- 
zygous plant (C) having in its gametes a weak peloric tendency of about 35° 
there was much variation in the offspring, and on selfing some of these plants, 
(A x C) 1, 2, 7, 11, and raising a new generation it was obvious that considerable 
dilution of the peloric tendency had occurred. On crossing the same plant (A) 
with a heterozygous plant (D) having a stronger peloric tendency (75°) in its 
gametes it was clear that in the next generation raised (A x D) 6, 5, 11 less 
dilution had taken place than in the former case. 


Pelorism— Various Pairings. 
ra - - , he oO 
Peloric Offspring 6 Peloric Offspring ‘2 
o = 
Parentage - = Offspring (selfed) a 
100°} 75° | 50°| 25°} & : 100°] 75°] 50° | 25° | 
A (100° pelorism) | 33) 0} O| O 0) A pl. 2 (100° pelorism) 13} 1 | 01} 0 0 
Selfed=LARx RR A pl. 3 5 3) |) Ol SOMmOk 0 
Apl. 4 . PaO PO. | 0 | oO 
A pl. 6 * 6 |) 07/07 0 0 
A pl. 9 ef OOO! oO 
A? (100° pelorism) | (A x B) pl. 2 (non-peloric and; 6 | 0 | O | O | 21 
x 0} O} O|} O | 13] heterozygous) 
B ¢ (0° pelorism and (A x B) pl. 7 (non-peloric and} 5 | 0 | O | O | 23 
homozygous) heterozygous) 
Ax B=RkRx DD (A x B) 2 pl. 8 (100° pelorism) | 12 | 1 | 0 | O 0 
(Ax B) 2 pl. 9 - mG Oi Of © 
C (heterozygous) 
Selfed= Dk x DR E00) 25 e35 ers 
A @ (100° pelorisi) (Ax @) pl. 1 (75° pelorism) | 20] 4 |] 9 | 2 0 
x 4/12} 3] 1) 7] (AxO) pl. 2 (50° pelorism) Cals elt 1 (0) 
C g (non-crowned and | (A x C) pl. 7 (heterozygous) | — | — | 2 | O | 10 
heterozygous with, | (Ax C) pl. 11 4 —/|2]3)1 7 
say, 35° pelorism in 
gamete) 
AxC=RRxDR 
D(heterozygous)selfed) 1 | 1 | 1 | O 8 
A 2 (100° pelorism) | (A x D) pl. 6 (100° pelorism) | 17 | 2 | 2 | O 0 
x | 2} 0] 0 | 10] (AxD)pl.s5 a8 2) | O 1 6 ale Onleno 
D g (non-crowned and | (A x D) pl. 11 (75° pelorism) | 27} 5 | 1 | O 0 
heterozygous with, | 
say, 75° pelorism in 
gamete) 
AG (100) «2 (505) 4 3 fon 0 


In the last generation it will be seen that there was no sharp separation of the
plants into two groups attributable to the two grandparental factors. Thus, in
the case of (A x C) pl. 2 (50°) the offspring are not clearly divisible into those of
100° resembling A, and those of 35° attributable to C; in other words there was
no obvious segregation into two degrees of pelorism.


On the factorial and chromosome hypotheses we must suppose that the factor 
or factors governing the peloric character tend to become mutually changed and 
intermediate in nature when the male and female chromosomes containing the 
factors for the two degrees of pelorism lie alongside each other in the zygote. 


It will be of interest to obtain a general measure of the strength of inheritance 
between mid-parent and offspring with respect to the transmission of the degree 
or intensity of pelorism. For this purpose only recessives were used, involving 
30 mid-parents. Employing Prof. Karl Pearson’s method the accompanying table
gives the correlation surface. 
Pelorism. Correlation Table. Recessives. Mid-parent and Offspring.

Mid-parents.      |      Offspring. Grade of Pelorism.      |
Grade of Pelorism | 76°–100° | 51°–75° | 26°–50° | 1°–25°   | Totals
1°–25°            |     6    |    2    |   23    |   18     |   49
26°–50°           |    58    |   61    |   68    |   11     |  198
51°–75°           |    64    |   31    |   15    |    -     |  110
76°–100°          |   143    |   14    |   11    |    5     |  173
Totals            |   271    |  108    |  117    |   34     |  530

The coefficient of correlation, calculated from the table, between mid-parent 
and offspring is .52. The result can be regarded as only a very rough approximation,
since a satisfactory method of measuring pelorism has yet to be found.
The figure obtained is somewhat low, but it would seem to indicate that the in- 
heritance of the degree of pelorism is of the nature of ordinary blended inheritance. 
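
As a rough check on the order of magnitude, a product-moment coefficient can be computed from the cell frequencies. The sketch below assumes that the four grades are equally spaced and that the offspring columns run from the highest grade to the lowest, as the marginal totals suggest; it is not necessarily the procedure the author followed, and the names are illustrative.

```python
import math

# Rows: mid-parent grade 1 to 4 (1-25, 26-50, 51-75, 76-100 degrees).
# Columns: offspring grade 4 down to 1 (assumed column order).
table = [
    [6, 2, 23, 18],
    [58, 61, 68, 11],
    [64, 31, 15, 0],
    [143, 14, 11, 5],
]
parent_scores = [1, 2, 3, 4]
offspring_scores = [4, 3, 2, 1]

n = sx = sy = sxx = syy = sxy = 0
for y, row in zip(parent_scores, table):
    for x, count in zip(offspring_scores, row):
        n += count
        sx += count * x
        sy += count * y
        sxx += count * x * x
        syy += count * y * y
        sxy += count * x * y

# Product-moment correlation from the grouped frequencies.
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
print(n, round(r, 2))
```

With equally spaced scores the result comes out close to the value quoted in the text, which supports the reading of the table adopted here.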

The point of interest to notice is that the union of two peloric plants of
different peloric intensities influences the gametes, while the union of a peloric 
plant with a homozygous non-peloric plant does not very readily affect the purity 
of the gametes with respect to pelorism. 

Pelorism. Effect of Selection in a homogeneous race. 

A peloric plant (C) with pelorism of about 85° intensity was self-fertilised, and 
the offspring, 16 in number, were as follows: 7 with 100°, 4 with 75° and 5 
with 50° of pelorism.


| 


S a | 
Crowned Offspring = | Crowned Offspring | 2 | 
Parentage i Parentage | | 
(Self-fertilisation) Ss) (Self-fertilisation) 5 | 
100°} 75° | 50° | 25° | © 100°} 75° | 50° | 25° | 6 | 
Zz | | A 
C'(85°) 7 4/5 |01; 0 
——— | —_ | + C 2, 11 (75°) | 6 | 13 Dn OM 0 
Meee 1 | [ 
| C2 (50°) ieey 8518 ZH || (KO to) {i ) ae 2, 2 (50°) 2 1610" | 20 0 
1 
L ule |—— LC 2, 8 (50°) ; 1 | 138] 11] 0} 0 
EG AG0:). -.. | 17 LIL | 18 | 2 | 0 | 
i ee 
C7, 10 (25°) -... 0) 2/18 | 2 | 5 
HAs ape oe A ee a =| 
C 7, 10, 20 (25°) 0} O i 7 
L—( 7, 10, 20, 4 (0°) 0 0} O0/| 0 | 6 
Two of these plants of 50° (C 2 and C 7) were selfed, and the generation raised
exhibited a lowered pelorism. The various selections made and the results 
obtained are shown in the accompanying table. It will be seen that finally on 
the selfing of plant C 7, 10, 20, 4 (0°) only non-peloric offspring were obtained. 


2. GENERAL COLORATION OF THE COROLLA. 


As described in the previous report (loc. cit.) the intensity of the purple
coloration was measured by comparing it with a colour-scale founded on the 
intensity of colour by transmitted light of varying depths of a standard colour- 
solution. 

Purple and white foxgloves exhibit the ordinary Mendelian relationship, purple 
being dominant. A confusing aspect of the problem is introduced by the fact
that “white” foxgloves are not necessarily entirely white, since they may exhibit 
a faint purple coloration which on the colour-scale adopted may amount to 
about 5. On crossing such a plant with an ordinary purple plant segregation 
occurs when the heterozygous offspring are self-fertilised. Any higher coloration, 
say 10—15, does not exhibit segregation, but gives a blended inheritance, and 
such a plant is to be regarded as a very pale purple one and not “white.” From 
certain observations that have been made it is probable that a similar condition 
occurs in the Blue Agapanthus lily, since some of the “ white” plants have flowers 
faintly tinged with blue. It is quite likely that the phenomenon is general, and it 
may throw an important light on the physical theory of heredity. Possibly it
may be surmised that a factor for a coloration of less than 5 units is unable 
to blend with, or influence, the factor controlling a higher coloration, in that we 
have reached the lowest dynamic unit. 

Of the ten original plants, five were purple and homozygous, four were purple 
and heterozygous and one was white or recessive. These were very variously 
crossed in all manner of ways. In the accompanying table the experimental 
results are compared with the Mendelian expectation for the different gametic
pairings.
General Coloration of Corolla: Breeding Results.

Gametic Nature | Number of | Number of |           White            |           Purple
of Pairings    | Families  | Offspring | Experimental | Expectation | Experimental | Expectation
DD x DD        |    120    |   1620    |    2 + 3     |      0      |     1615     |    1620
RR x RR        |     17    |    336    |     330      |    336      |        6     |       0
DR x DR        |     50    |    785    |     190      |    196      |      595     |     589
DR x DD        |     11    |    103    |       0      |      0      |      103     |     103
DR x RR        |      8    |     76    |      24      |     38      |       52     |      38
DD x RR        |      8    |     87    |       0      |      0      |       87     |      87
Totals         |    214    |   3007    |     549      |    570      |     2458     |    2437
In the gametic group DD x DD (homozygous purple x homozygous purple)
there were 1620 offspring. These should have been all purple, but there were two 
white plants which occurred in two deeply coloured families and three white 
plants which occurred in one pale-coloured family. I do not believe that there 
was contamination, and it is probable that the two former plants were sports,
while the three latter plants were produced by selection. 


In the group RR x RR (white x white) there were 336 offspring, and these 
should have been all white, but there were six pale-coloured plants. The difficulty 
in distinguishing a tinged “ white” plant from a pale-coloured plant may account 
for this result, but I favour the view that we are here witnessing the beginning of 
a coloured race. 


The result given by DR x DR (heterozygous purple x heterozygous purple) is 
very closely Mendelian. Out of 785 offspring there were 190 white plants while 
the expectation was 196. 


Heterozygous plants crossed with dominants (DR x DD) gave nothing but
coloured plants, and this was also the case with dominants crossed with recessives
(DD x RR).


The gametic group DR x RR (heterozygous plants x recessives) gave a result 
which diverged rather widely from the expectation: there were insufficient whites, 
there being 24 whites and 52 purples instead of 38 of each. The numbers are 
somewhat small for drawing conclusions, but it is important to notice that in the 
character of pelorism it was the same gametic group which diverged the most
widely of all the classes from the theoretical expectation. On the chromosome 
hypothesis it may be conjectured that possibly preferential pairing of the male 
and female chromosomes may explain the discrepancy. 
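
The size of the divergence in the DR x RR group can be gauged against random sampling. The following sketch, which is an illustration added here and not a calculation made in the paper, expresses the departure of the recessive count from the 1 : 1 expectation in units of the binomial standard deviation, for coloration (24 whites in 76) and for pelorism (98 crowned in 156); the function name is hypothetical.

```python
import math


def deviation_in_sd(recessives, total, p=0.5):
    """Departure of the recessive count from expectation, in binomial S.D. units."""
    expected = total * p
    sd = math.sqrt(total * p * (1 - p))
    return (recessives - expected) / sd


colour = deviation_in_sd(24, 76)     # whites among DR x RR offspring
pelorism = deviation_in_sd(98, 156)  # crowned among DR x RR offspring

print(f"coloration: {colour:+.2f} S.D.")
print(f"pelorism:   {pelorism:+.2f} S.D.")
```

Both departures exceed three times the standard deviation, but they lie in opposite directions: recessives are deficient for coloration and in excess for pelorism.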


The Inheritance of the Intensity of Coloration. 

On crossing a purple homozygous plant with a white plant the offspring were 
all heterozygous and all coloured, but the intensity of the coloration was mostly 
reduced very considerably. On selfing these offspring the next generation yielded 
some homozygous dominants in which the original colour-intensity of the grand- 
parent was regained; thus, at first sight it appeared that there had been no real 
dilution of the colour by crossing with the white. This was my first impression 
from the earlier results, but with more extended experience I found that there 
was certain evidence that the crossing with the white did have some deleterious 
action on the intensity of the coloration of the dominant grandchildren, although 
the coloration which appeared was much greater than a half and half blend with 
white. 


If two homozygous dominants of marked difference in colour-intensity were 
crossed, the offspring tended to be intermediate. On selfing these offspring the 
next generation was similarly intermediate, and there was no segregation into the 
two different intensities of the grandparents. Thus a true blend of the two
intensities had taken place. 




In the accompanying table the results of some instructive crossings and self-fertilisations
are given. In Series I two dominants (E and F) of different colour-intensities
were selfed and the families raised showed that the parents were homozygous.
On crossing (E) and (F) a family of intermediate offspring was obtained.


Colour-Intensity. Various Pairings.


: “= : Colour- Seale Onseene 
Goloured tplants ‘© WHITE ”’ 
No. of Gametic . Mid- ea j al Mean of 
Seri eons rente arental} > | n | oo Coloured 
Series ETA Parentage Colour r I : S | ® : =H S Cs S Offspring 
Ps ale 1 P| | 
oo | RO | |) con] =a) | sept) Gesuiifrcon ime 
SO) Wa Seal Pare Saba eS | ee | SS u 
| ee? sn Te a 2 ws ee ieee (ee - 

I DD x DD | E(selfed) ... ost bes 99 — || 2. |, 40 5 8 os a a Soa 83 

DDxDD\F oe aa ans 68 —|—|—|—]} 6 3 as a fh ae 64 

DDx DD \ QEx 6 F Hoot pee —}—|—] 4] 38} 5) —f-—]— an 71 

DDx DD (Ex oa pl. 16 (selfed) Ee 82 — 2 6/ 2)/—|]— wat = 87 

DDx DD | (Ex F) pl. 9 eee) Pe 61 —;—|)—|]—]} 1] 10} 1} —j|— ea 56 

i |DDx RR | 9 Ex 3 (Wate) rae 2 ec Tl | ale = 

DRxDR | EX Yate ele 18 (selfed) 71 —j|—/1 By PIN | ott tae = 2 66 

I i . DDx DD | B (selfed)... : ost 95 By) Oy By eral es. = |] a ram 102 

DDxRR|GBxe (Wuirtr) AZ —|—-|—] 1 Ae le || ah a] — = 59 

DRx DR | (B x Wurte) pl. 1 (selfed) 80 SS ees I Pes | | — S 64 

DRx DR | (Bx Waite) pl. 5 (selfed) 32 — |—}|—}| —|—]—/| 2 aan 5 3] 

| vf. DD x DD Belcan are Beis 95 3 0 2 3 6 1 4 102 

DRxDR | i Soae ie ae Seen 0 — | — | — | 4) 129.4 1) 1 Se 7 70 

DDxDR|GBxdA 3 82 PA) pl ANS eA, | 105 

DDx DD | (Bx A) pl. 7 (selfed) on 30 1 1 | 38 | 12) 13) —}—j|—|— as 87 

DDxDD | (Bx A) pl. 2 a (ecliow) m3 65 Fa ag 2/13/12 )—j,—]— = 67 

va DDx DD B (selfed)... sista ae | 95 3 10) 2 3 6 i — | — | — ore 102 

DRx DR | C (selfed) .. 3 | 34 | | are | | 4 40 

DDx DR ° Bx so aBG 65 == |=) = | 2 AVN Oe Nea. |) me ae 53 

DDx DD | (Bx) pl. 8 (selfed) ae 50 Se a ee Pee | a 50 

DDxDD pl. 4 ate tacks 50 — == — —_ 4 Lei 33 a — 59 

DDx DD pl. 7 ae ue 58 ee 5 Aaa Ta a 60" 

DDx DD pl. 6 : oe 68 Wow sel ee) Oe Oe | = as 5A 

DDx DD pl. 1 er és 70 sa eee cca eS ee ee eS) 74 


Two of these offspring were selected, (E x F) pls. 16 and 9, as widely divergent
from each other as possible, and selfed. In the families obtained there was no
tendency for the occurrence of segregation into the two colour-intensities of (E)
and (F) respectively. There was thus a definite blend, and the means of the two
families approached the respective colour-intensities of the two self-fertilised
plants.

In Series II the same homozygous dominant plant (E), with colour-intensity of
90, was crossed with a white plant and all the offspring were heterozygous and
intermediate. On selfing one of the darker coloured offspring, no. 18, the dominant
plants raised tended to be of about the same colour-intensity as the grandparent
(E). In Series III a dark-coloured homozygous dominant plant (B) was also
crossed with a white plant. One of the darkest heterozygous offspring (B x White)
pl. 1 was selfed and the coloured plants raised tended to be paler than the grand-
parent, but the family was small.

In Series IV the dark-coloured homozygous plant (B) was crossed with a dark 
heterozygous plant (A). From the offspring raised, two were selected and selfed, 
one very dark and the other moderately dark. The two families included only 
coloured plants, and consequently the parents may be supposed to have been 
homozygous. The moderately dark parent (B x A) pl. 2 failed to produce any
offspring as dark as the grandparent (B). 

In Series V the same plant (B) was crossed with a light heterozygous plant 
(C). From the offspring produced five homozygous dominants were selfed, and in 
the five families raised only two plants reached the colour-intensity of the grand-
parent (B).

On taking all these results together it may be said that there is evidence for 
the view that crossing a dark race of foxgloves with white plants tends to dull the 
colour-intensity of homozygous dominants of subsequent generations. 


General Coloration—Strength of Inheritance and Effect of Selection. 


In 1914 a dark-coloured homozygous plant (B₁ ♀) was crossed with a somewhat
pale-coloured heterozygous plant (C₁ ♂) = DD x DR = III. The offspring would
consist theoretically of approximately equal numbers of dominants and hetero-
zygous individuals. The reciprocal cross (C₁ ♀ x B₁ ♂) was also made = II. Several
dominants were selfed and families were raised. Out of these families certain 
plants were selected and selfed and new families were obtained. This procedure 
was continued until 1917, and the results are given in the accompanying table. 
The families of the different years are arranged in ascending order of the colour- 
intensities of the parents. On comparing the means of the families with the 
colour-grade of the parents (shown in brackets) it will be at once seen that small 
variations in the colour-intensity of the parents tended to be transmitted to the 
offspring. It is obvious that the table exhibits the effect of selection in self- 
fertilised homozygous generations. 


For example we may take the following: 


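Homozygous plant, II 1, had a colour of 70 and a mean of offspring 74
An offspring of above, II 1, 4, had a colour of 74 and a mean of offspring 82
An offspring of above, II 1, 4, 17, had a colour of 110 and a mean of offspring 95
Homozygous plant, III 2, had a colour of [illegible] and a mean of offspring 66
An offspring of above, III 2, 1, had a colour of 66 and a mean of offspring 55
An offspring of above, III 2, 1, 18, had a colour of [illegible] and a mean of offspring 85
An offspring of above, III 2, 1, 18, 28, had a colour of 95 and a mean of offspring 100

Reverse selection is shown also:

Homozygous plant, III 2, had a colour of [illegible] and a mean of offspring 66
An offspring of above, III 2, 5, had a colour of [illegible] and a mean of offspring 57
An offspring of above, III 2, 5, 5, had a colour of 40 and a mean of offspring 41
An offspring of above, III 2, 5, [illegible], had a colour of 30 and a mean of offspring 32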
Inheritance of Colour-Intensity among Dominants.


Dominant Generations (Self-fertilisation) 


Thus, starting with a plant of about 70 colour-intensity we arrive by selection 
of self-fertilised plants at mean family intensities of 100 in one direction and 32 in 
the reverse direction. 

In another series, starting with a homozygous dominant plant of colour-
intensity of about 11, I have by selection obtained plants in which the corolla
exhibited no general tint. On selfing the pale plant no white plants occurred,


~ ES => — oS | 
Grades of A ell ess |e eel et eel S| |S Bie |S |S | 
| Colour- $ S Sy 3 a re coe eet eee lies R oo est < i S 
Scale a ~~ © | a Say Get alle Monn dl iesigl| ceaean | SORTS nll aie SS 
(Offspring) a aa eae lees = Cael aah satan Meta fara aera | = iat lle 
py 5 at rar mY = = = i= a Op a | oc | ral = | 
tt te Se Veale ast Sseuiest = || Sl m | 
30— 39 — —|—/|—)—}|—|)—}/~)-}J-}|-—};-|-—]-|- 
40— 49 — 2 )—)—)— ~)—) —)—|—-f—) —})—]} —] — |] — 
50— 59 1 | 15 5 il 1 2);—)— 5 1 3} — 1}—}]—]— 
60— 69 e | = ui rh 7 6 3 4 2) 2 4 3 3 1}—)— 
7O— 79 2, || |, — 2 8 1 7 4 5 J — 1 3 O0o;—}]|— 
80— 89 } f—}|—f—m—yorl}—fy—-l 6 3 1yj—|— 2 6 1 | — 
90— 99 Se ee er lO 
100—109 —|/—!—-—};—/;—/]—|!—}] 1 Yo) en ee ero es 
10—119 | | — | —| |=} 9) 0 |= ee ee 
120—129 | }—|—|—)—)— | — | 2) es See ee eee ee 
Means 66 | 54 | 62 | 66 | 69 | 64 | 64 | 82 | 78 | 70 | 59 | 65 | 71 | 81 | 95 | — 
a | pa | | SO. | SS eg ie alan Slo cc: 
ee a~lelalelelele!|S1S\81 81 a2 eee 
Grades of | x 2 iS Ss | a SS SS | RS, eee = o5) = 3 Pe) |i a = = a a ai} A 
Colour | Sis ee st laos | ce ate Lee | | SP is |S ienoee eo ome ce ea) 
Seale Se A ot [ok | ce] oP | ok [or P08 Fe P88 a 7 es ee ae ee 
(Offspring) | PR Oe |e | pe le fH PN TN Wl ale Gy, igen eo See areata 
= =Seo seaman Ga bee tom eames Klan onl | ap al ap) a 
ll = fo ne Pee ee esta em elites th pee rer halt Ga 
= ee eda ee pp 
= al eH cal lal il a 
20— 29 ee ee 
= SOU My me ha eee es eee peat ae ae ae es 
4o—49| O|. = | 3] 3/—| &|—|—1-7) 8|/—)| of — 1) 4) oo ere ee 
50— 59 | 1 4,9) 6] 3).5)=)] 1. —le8) &) ob Ve) 1 aoe ee 
60— 69 0) 17 3 2 1 2 6} 11f—) — 4 2 lyj—|—-—)|;—|— 0) 1 3) — 
70—- 79 2 9 2);—|]— O | 11 9fF—) — 1 4 27—};—)|;—)|;— 0) 1 0) 1 
80— 89 l 1 —|—|— 1 il Oy— —|— 2 3,—)'|—)|—);— 2 il 1 4 
90— 99 | — — Pe | mee fe ec ee eran 
100—109 | — Po J | 2) |) ee oe Se ee ee eee 
110119 | —f- = ff) — |) S| SS) i) Se SS SS eee 
oe Ae pl ees] Pe | |e ee : SS SSS | S| Ol 
130—139 | — |. — J—|—) —| =| fe | a aS Se aa aS ea ae eee 
Means 67 66 57 | 54 | 60 | 55 | 71 | 69 | 41 | 47 | 62 | 68 | 85 | 32 | 43 | 47 | 44 | 74 | 75 | 91 | 100 
and the offspring were all pale-coloured; but when the intensity was decreased by
selection to about 4, the “white” plants showed Mendelian segregation, for the 
offspring arising from the plants produced from a cross with a dark-coloured 
plant were sharply divisible into strongly coloured and “ white” individuals. 

As a further example of selection, I started with a homozygous medium- 
coloured (48) plant (G). This was selfed and a family of 31 coloured plants was
raised; there were no whites. Thus, the parent plant may be regarded as homo-
zygous. A plant (G 3) in this family, not far removed in colour (55) from the
average, was selfed and the resulting family had a mean colour approximating to
the colour of the parent. A light-coloured (27) plant (G 3, 20) and a dark (81)
plant (G 3, 13) in this last family were selfed also, and the two families raised
tended to resemble their respective parents. In a succeeding generation further 
progress was obtained in securing a dark race and a pale race. The necessary 
details are given in the accompanying diagrammatic table. The families printed 
in heavy type are those leading to a dark race, while those in ordinary type are 
passing into a light race. 


Formation of Light and Dark Races from a Dominant
(homozygous) G.


Parents Offspring—Seale of Colour 
ol er/rx/ ai ale] wo] we] 
Yt al alt ST ST SU 1 ~1 vt 
Number Colour | | | | | | | | 
S Re) o% ie D MT] & | & vw 
mS by os oS b oS ~~ | > YS 
| 
G (selfed) NG 4S cer eee eo) ates LD Gi | 2 
G pl. 3 one il) 9 Tears} —|—|]}2]1]0 4/5 |3] 1 
GiSaecOteeees Pose 1 t= ee ee ea gee 
Gopis) 2. | 82 | — |) 2-3 |\9 1a) ale |—)—| 
G 3, pl. 20 = 27 ee ae ed 
G3, 13;pl 2 ... SO 2/;6;2);—|;—j;-—/!-—;|;-|- 


Correlation Table—Colour-Intensity—Dominants (homozygous). 


Series II and III.


Parents. areal eel tes | OS tt oan ics ics ons oS 
Grade of = he] = > > > me 
Colour- L | it i | | =. 
Intensity S| eS x = ra 
380— 39 — | —- — = = |) 4 BP 33 i 
ae Oe an | ea er ey o6-1 O05 Tor) == 74 
= 5) \\ ee ce I Sh | Bas 8} |e 100 
6O— 69 | = eS oii | iss |) B47/ |] Bak Mats} 1 103 
70— 79 —\|— 2 2 8 | 19 | 54 | 50 9 O 1) 145 
80— 89 —;—|— 2 5 33 2 L)—)}—|— 17 
90— 99 2 5 9 TP ayy 1h) 5 m 5|—|—]|]— 61 
100—109 — — | — — | — — 
110—119 — 2 (0) — | — — 
120—129 BS Ss 5| 2 — 
Totals 
In the last table, p. 113, a correlation surface is shown between parents and
offspring. It is formed from the series of families given in the table preceding the 
last, and arising by self-fertilisation. 


The constants calculated from the table are: standard deviation of weighted
parents 1.7805 units, and of offspring 1.8962 units; coefficient of correlation .707.


In this table 39 families were involved, as detailed in the previous table. 
The starting points were four homozygous dominant plants occurring in the two 
families raised from the reciprocal crosses (C₁ x B₁) and (B₁ x C₁).


3. Brown Spots.


The amount of spotting on the inside of the corolla is not closely correlated to 
the intensity of the general purple coloration of the flower, for even in white 
plants the spots may be numerous and of a deep purple colour. In coloured 
plants the spots were almost always dark purple. As a very rare exception in the 
coloured plants (4 plants in about 2500) some of the spots were russet brown, and
in the case of the larger spots there was a middle area of brown bordered by a 
margin of purple. In white flowers the spots were fairly frequently brownish- 
green or brown. In such brown spotted white flowers I could never detect the 
slightest tinge of purple on the general surface of the corolla, while in purple- 
spotted white flowers a faint tinge of purple could often be seen. The brown 
spots of white flowers might not become visible until the flowers were on the 
point of fading, and in the case of any given white plant it was wholly impossible 
to affirm that brown spots were, or would be, entirely absent from all of the 
flowers. 

With the exception of the four plants mentioned above there was a sharp dis- 
continuity to the naked eye between purple spots and brown spots, intermediate 
conditions being absent. The brown colouring matter may be regarded as altered 
or decomposed anthocyanin. In purple spots a microscopic examination often 
showed a certain amount of decomposition; but, with the exception of the four 
plants, the amount was not enough to alter the colour of the spots sufficiently for 
detection by the naked eye. Thus, the discontinuity lies between a normal small 
amount of decomposition, and an abnormal entire decomposition. It may be 
stated that under ordinary circumstances brown or greenish spots (as seen by the 
naked eye) are linked to a perfectly white corolla, but purple spots occur in both 
purple and “white” flowers, and an apparently perfectly white corolla may also 
bear purple spots. 


If a brown spotted plant is crossed with a purple spotted one the offspring are 
all purple spotted and heterozygous. The brown spotted condition is inherited in 
Mendelian fashion, and is recessive to purple spots. 


No special crossings have been made to investigate the matter, and the results 
which are given below are merely picked out from the records of the numerous 
families which have been raised for other purposes. 
In the accompanying table it is useless to include families in which there was
no taint of whiteness, since all the individuals (except 4 plants out of 2500) had 
purple spots. 


Brown Spots—Families White or Some Taint of Whiteness.


                                               Purple Spotted               Brown Spotted
Gametic Nature   Number of   Number of   Experimental   Mendelian     Experimental   Mendelian
of Pairings      Families    Offspring                  Expectation                  Expectation
DD x DD             13          344          344           344              0             0
RR x RR             11          169            0             0            169           169
DR x DR             13          213          166           160             47            53
DR x DD             15          137          137           137              0             0
DR x RR              1            8            3             4              5             4
DD x RR              6           70           70            70              0             0
Totals              59          941          720           715            221           226


It is obvious from the table that the brown spotted condition exhibits Men- 
delian inheritance. 
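The Mendelian expectations in the table can be checked with a short calculation. The sketch below is not part of the original paper: it assumes a single factor with purple spots dominant, takes the brown-spotted count for DR x DR as 47 (the value implied by the row total 213 less 166 purple, and by the column total 221), and adds a Pearson chi-square for the DR x DR families as an illustrative goodness-of-fit measure. The function and variable names are hypothetical.

```python
from fractions import Fraction

# Expected fraction of brown-spotted (recessive) offspring for each pairing,
# assuming one-factor Mendelian inheritance with purple spots dominant.
RECESSIVE_FRACTION = {
    "DD x DD": Fraction(0),
    "RR x RR": Fraction(1),
    "DR x DR": Fraction(1, 4),
    "DR x DD": Fraction(0),
    "DR x RR": Fraction(1, 2),
    "DD x RR": Fraction(0),
}

# (pairing, number of offspring, observed purple, observed brown) from the table.
OBSERVED = [
    ("DD x DD", 344, 344, 0),
    ("RR x RR", 169, 0, 169),
    ("DR x DR", 213, 166, 47),
    ("DR x DD", 137, 137, 0),
    ("DR x RR", 8, 3, 5),
    ("DD x RR", 70, 70, 0),
]


def expected_counts(pairing, n):
    """Return (expected purple, expected brown) as exact fractions."""
    brown = RECESSIVE_FRACTION[pairing] * n
    return n - brown, brown


def chi_square(observed, expected):
    """Pearson chi-square over the classes with a non-zero expectation."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e != 0)


for pairing, n, purple, brown in OBSERVED:
    exp_purple, exp_brown = expected_counts(pairing, n)
    print(f"{pairing}: observed {purple}/{brown}, "
          f"expected {float(exp_purple):.2f}/{float(exp_brown):.2f}")

# Goodness of fit for the only large segregating class of families.
chi2_dr_dr = float(chi_square((166, 47), expected_counts("DR x DR", 213)))
print(f"chi-square for DR x DR (1 degree of freedom): {chi2_dr_dr:.3f}")
```

Rounding the exact expectations for DR x DR (159.75 and 53.25) gives the 160 and 53 printed in the table.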


4. INHERITANCE OF CERTAIN SPORT ABNORMALITIES. 


Crenate Margin.—In a homogeneous family of 29 plants there appeared one 
plant in which the free edge of the mouth of the flower exhibited a well-marked 
serrated condition. All the flowers of a main-axis of considerable size were
similarly affected, and later, lateral flowering axes were formed, and the flowers 
were also serrate. The character was sufficiently marked to be noticeable at a 
casual glance of the plant, and since all the numerous flowers were alike in this 
particular, the character was clearly inherent in the plant, and was not due to a 
chance environmental disturbance influencing a young growing axis or certain 
flower-buds. The plant was self-fertilised, and it was confidently expected that 
the character would reappear in the offspring. Out of a family of some 20 plants
12 flowered and no sign of the peculiar serrated condition could be detected in 
any one of the plants. Here we have a conspicuous character in a large healthy 
plant affecting every flower of all the flowering axes, and yet apparently it was 
incapable of being transmitted to the offspring.


Split Corolla.—In a homogeneous family (XXXIV) of 27 plants there appeared 
one plant in which in the great majority of the numerous flowers the corolla was 
symmetrically divided into an upper, a lower and two lateral pieces by four lateral 
splits extending down to the base of the flower. The plant was a large, healthy 
one and produced a number of similar lateral axes. At least 90% of the flowers
were completely split (Pl. I, fig. 10).

In a family (VIII 7) unrelated to the above there were 16 plants, and of 
these, four plants were similarly affected. In one of these plants practically all 
(99%) of the flowers were entirely split into four pieces, while in the remaining
three plants some 50—60% of the flowers were split. All the plants were large
and vigorous. It was thought that very probably the character would exhibit 
Mendelian inheritance. The results of crossing and selfing are shown in the 
accompanying table. 

Inheritance of Split Corolla. 


[Column headings: the parentage of each family, with the percentage of split flowers in the parents. Rows: number of offspring in each grade of percentage of flowers split.]
% | No Splitting | 961/12/ 8 | 7 | 2) 3/1) 0 | 0 | 9 | 96] 12 | 15) 10) 10 
2 4 1 0/0} 6) 38.14 121 19720" | COs OO om Om mar 
as |  15—29 01:0 | 3! O }-0.).0 ) Onl0 (0) 0) OF OR Om Om eo 
ene 30—I4 O |: 0 | O } O°} Bed | 2512 10.490" | 202) 20m orm Om mee 
Bes 45—59 Oo} 1} 0 |-2 | 4) b40 1 04)°0 | 0 FO, | On om tom mo 
a. 60—74 O-| 2°) 0.) Te) e338.) 2] 10-0 moa Oe On mo 
5 75—99 1} 1) 0 | 2) 5) 47) 1) Dal 320. Om Om OM xO 


The first mentioned plant (XXXIV 4) with 90% of the flowers split was
crossed with an unrelated plant with some 99% of the flowers split (5th vertical
column of table). Of the 17 offspring 8 plants were wholly unsplit, while the
remainder exhibited the character in a very greatly weakened condition. Three
of these offspring, S. J. nos. 9, 18 and 6 having 0%, 13% and 18% of the
flowers split respectively, were selfed, and the families raised all contained some 
plants very conspicuously split, but the character was more marked in the two 
families raised from parents 18 and 6 which showed some degree of splitting. In 
a subsequent generation (S. J. 18 pl. 4 and S. J. 18 pl. 10) raised by selfing, the 
character became very strongly pronounced. 


An unrelated non-split plant (II 6, 1) was crossed with the first mentioned 
plant having at least 90% of the flowers split (XXXIV 4). In the family of
12 plants raised none of the plants exhibited splitting. Two of these offspring
(R. J. nos. 9, 16) were selfed and no splitting occurred in the two families. 
Another generation was raised from R. J. 16, plant 14 and some re-appearance of 
splitting was detected. The table includes all the split plants which have occurred 
among some 3000 plants which have been under observation. 


The results obtained indicate that heredity has some influence, but the data 
are insufficient for determining the nature of the transmission which does not 
bear a Mendelian aspect. 

Creased Upper Lip.—In a certain plant in the majority of the flowers the 
upper surface and lip exhibited a conspicuous pucker or crease. This plant was 
crossed with an unrelated normal plant with no crease. Most of the seedlings 
were killed by the violent elements, but four plants were raised, and in one, 
a number of flowers exhibited a crease, which, however, was much less developed 
than in the paternal parent. The data are scanty, but the hereditary trans- 
mission does not seem to be Mendelian. 


Spontaneous Appearance of White plants.—Among the numerous homozygous
dominant coloured families that have been raised a white plant appeared spon- 
taneously on two occasions in two unrelated families. These plants, of course, 
bred true, and as there was no evidence of contamination of the seed the plants 
must be regarded as new sports. 


5. INHERITANCE OF SEED-LENGTH. 


The mean length of the seed varied considerably in different plants. No 
discontinuous variation could be detected, and inheritance was of the blended 
type. Ten seeds were taken at random from one or more capsules of a number 
of plants of certain series and the means determined. The seeds of a capsule 
exhibited a moderate amount of variation, but they were monomorphic in varietal 
crossings, and not dimorphic as was noticed in an interspecific crossing. The 
distribution was more or less normal. Unfortunately there was very considerable 
variation in the mean size of the seeds in different capsules of the same plant, 
and consequently no very accurate determination of the strength of inheritance 
was possible with this character without an excessive number of measurements. 
As it was, the investigation entailed the measurement of about 1000 seeds. 


A plant, C₁ (mean seed-length 639 units), was crossed with B₁ (mean seed-
length 628 units) and a family was raised; C₁ x B₁ = II. In family II twelve
plants were selfed, namely II 1, II 2... II 12, the seeds were measured and twelve 
families were obtained. In family II 1 three plants were selfed and the seed- 
length determined, namely (II 1) 1, (II 1) 2 and (II 1) 4. The means of the seed- 
lengths of these three plants were compared with the seed-length of the parent 
II 1. Similarly, for example, in family II 1, 2 two plants were selfed, namely 
(II 1, 2) 5 and (II 1, 2) 20, and the means of the seed-lengths of these two plants 
were compared with the seed-length of the parent II 1, 2. The data are given in 
the accompanying table. 
Mean Seed-length, Parents and Offspring.
| 
Parent (selfed) | Offspring (selfed) | Parent (selfed) | Offspring (selfed) ]| Parent (selfed) | Offspring (selfed) 
‘es. | Meme) Desig | Met? | Desi | Mea" | Desig. | Mote] Desig. | Mtr) Desig | Mose 
| nation length nation length nation length nation length nation length nation length 
Ill 606 Il1,1 572 Il 4 592 II 4,8 628 Il 9 653 II 9,3 629 
111,2 | 668 II 4,12 | 598 
Il1,4 | 649 al 
ae eit rd Se a AS 620 TGsal 621 II 10 646 IT 10,1 660 
IL 1,2 | 668 | 11 1,2,5 | 668 if We! 3) | eat TT 10,2 9 |) 7669 
11 1,2, 20) 642 Il 6,4 | 670 TI 10,5 | 649 
oo (ee I1 6,11 | 695 II 10,7 | 660 
11 1,4 | 649 | 111,43 | s655 [— |__| — yf io. _ a 
11 1,4,17| 674 [II 6,11| 695 |116,11,6| 665 ne babes seca 
112 | 528 |-119,1 -| e2¢ | a7 "| 5479) 17,1 en | 2194) CoC ae 
12,3 | 582 Il 7,12 | 570 Eee Es Bt alee 
112,5 | 637 117,14 | 624 JIL 10,5) 649 | IL 10, 5,5) 598 
IT 2,16 | 566 |— - 2% 11 10,5,10| 629 
—— Z 117,1 | @71 | I17,1,7 | 649 Hobe ES) ee) 
113 629 II 3,1 686 a ee == |= MM a fe 
134 | 686 | qs | 620 | 118,2 | 6a9 | 11107] 860 | 11 10,7,9) 653 
113,15 | 672 WBE © PGES. | [is era 
al ie fees ee ID 11 | 615. |-Draay se elrz 
If4 | 592 ) I14,2 | 668 | IT9 | 653 | I19,2 | 633 
11 4,6 | 657 119,11 | 620 | 1112 | 679 | 1112,9 | 642 
11 9,10 | 630 
C₁ (self-pollen) seed-length = 639
B₁ (self-pollen) seed-length = 628
C₁ (B₁ pollen) seed-length = 642; these last seeds produced fam. II.


The coefficient of correlation, calculated from the above numbers, between
parents (selfed) and offspring (selfed) is .378. This is low for mid-parental corre-
lation; but as all the generations arose by self-fertilisation we ought to have
practically no correlation at all according to the pure-line hypothesis, for the two 
original parents (C, and B,) were closely similar to each other in the character 
under investigation. 
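The way such a coefficient is obtained from paired parent and offspring values can be sketched as follows. This is an illustration only, not the author's computation: it uses the ordinary product-moment formula, enters each parent once for every measured offspring (weighted parents), and takes just the seven pairs of the II 1 lineage that are legible in the table, so it will not reproduce the figure of .378 for the whole material. The function name `pearson_r` is an assumption.

```python
from math import sqrt


def pearson_r(pairs):
    """Product-moment correlation for a list of (parent, offspring) values."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


# Illustrative subset: (parent seed-length, offspring seed-length) for the
# II 1 lineage only. A parent is repeated once for each of its offspring.
ii_1_lineage = [
    (606, 572),  # II 1    -> II 1, 1
    (606, 668),  # II 1    -> II 1, 2
    (606, 649),  # II 1    -> II 1, 4
    (668, 668),  # II 1, 2 -> II 1, 2, 5
    (668, 642),  # II 1, 2 -> II 1, 2, 20
    (649, 655),  # II 1, 4 -> II 1, 4, 3
    (649, 674),  # II 1, 4 -> II 1, 4, 17
]

r_subset = pearson_r(ii_1_lineage)
print(f"correlation for the II 1 lineage subset: {r_subset:.3f}")
```

With so few pairs the value is subject to a large sampling error, which is the difficulty the text notes for this character.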


6. PURPLE SPOTTING OF THE COROLLA. 


The purple spotting of the lower surface of the corolla-tube and lower lip 
varied greatly in the original parent plants, and the character was obviously 
inherited. The amount of spotting had little relationship to the intensity of the 
general coloration of the corolla, and “white” flowers were sometimes richly 
spotted with purple. 


The percentage area of the lower surface covered with spots was estimated by 
comparing the flowers with a series of diagrams each covered with a definitely 
known percentage of spotting. With practice it was found that sufficiently 
uniform results could be obtained by this method. 
In plants which had lost completely the power of producing any purple
coloration whatever, the spots were brown and usually small and scanty, and 
among such plants an almost entire absence of spots of any kind occasionally 
occurred. We have already seen that with regard to the colour of the spots 
(brown and purple) Mendelian segregation takes place. 

In the inheritance of the amount of purple spotting no Mendelian relationship 
could be detected. The smallest amount of purple spotting met with in coloured 
foxgloves equalled about 1%, and the maximum about 70%. It will be re-
membered that on crossing a dark purple plant with a plant bearing flowers very
faintly tinged with purple (say, colour 4 of standard), definite segregation into 
“white” and purple plants occurred in the second generation following; but on 
crossing a plant possessing an abundance of purple spots (say, 50%) with a plant
bearing very few purple spots (say, 2% or 3%) no such segregation was found,
and the spotting tended to remain intermediate in amount. 

In the numerous crosses that have been made for various purposes the con- 
dition of the spotting was observed, and it is undoubtedly true that the means of 
the spotting of the families resulting from the crosses tended on the average 
to approximate to the spotting of the mid-parent, ½(♂ + ♀). No difference could
be detected between the reciprocal crosses of two plants. 


Influence of Selection and Strength of Inheritance in Self-fertilised Generations. 

In this connection details of Series II and III may be given (see p. 120). Plant
C₁ with 11% spotting was crossed with pollen of plant B₁ with spotting 48% = II.
Seven of the offspring were selfed and the spotting of the resulting families was
determined. Subsequently two other generations were raised by selfing. Plant B₁
was crossed with pollen of C₁ = III. Four of the offspring were selfed and sub-
sequently three other generations were raised by self-fertilisation.
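The mid-parent value ½(♂ + ♀) for the two original plants of these series follows directly from the figures just given. The snippet below is a minimal sketch; the function name and the dictionary keys `C1` and `B1` (standing for the two parent plants) are assumptions made for illustration.

```python
def mid_parent(father, mother):
    """Mid-parent value: half the sum of the two parental values."""
    return (father + mother) / 2


# Percentage spotting of the two original parents, as stated in the text.
SPOTTING = {"C1": 11, "B1": 48}

# Series II: C1 as seed parent, B1 as pollen parent.
series_ii = mid_parent(father=SPOTTING["B1"], mother=SPOTTING["C1"])
# Series III: the reciprocal cross.
series_iii = mid_parent(father=SPOTTING["C1"], mother=SPOTTING["B1"])

print(series_ii, series_iii)  # 29.5 29.5
```

The mid-parent is the same for a cross and its reciprocal, so any difference between the means of families II and III would have to come from some other source; the text reports that none could be detected.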

The distributions of the spotting in the families of the different generations are 
shown in the accompanying table. In each generation the families are arranged 
in the ascending order of the parental spotting (see the top and middle horizontal 
lines). A casual inspection indicates at once that the general trend of the family- 
distributions follows the gradual increase in the spotting of the parents. 


As an example of selection we may take:


III 2 (9%) selfed produced with others a plant III 2, 5 (15%)
III 2, 5 (15%) selfed produced with others a plant III 2, 5, 10 (22%)
III 2, 5, 10 (22%) selfed produced with others a plant III 2, 5, 10, 17 (27%)
III 2, 5, 10, 17 (27%) selfed produced a family with mean spotting of 39%


Thus, we have passed from a plant with 9% spotting to a plant with 27%,
which on selfing produced a family with a mean spotting of 39%.


With reference to the strength of inheritance two tables are given on p. 121, 
one for parents and offspring, and one for grandparents and grandchildren. The 
respective coefficients of correlation are .560 and .395. This correlation does not
arise by the mixture of two races which have been sorted out by segregation 


Series II and III.


Purple Spotting—Families from Self-fertilised Parents.


[Table: distributions of purple spotting in the families of each self-fertilised generation of Series II and III, arranged in ascending order of the parental spotting.]


during the different self-fertilised generations. Inspection of the tables shows 
that the distributions of the various families give no indication whatever of the 
occurrence of segregation into little spotted and much spotted plants. The 
gradual rise in the degree of spotting of the different parents is followed by a 
gradual increase in the spotting of the respective families obtained by self- 
fertilisation. The fact that the correlation between the grandparents and grand- 
children is less than that between the parents and offspring is further evidence 
that the small, apparently fortuitous, variations in spotting occurring among self- 
fertilised generations are inherited. This result is opposed to the pure-line 
hypothesis, according to which such small variations are regarded as slightly
different expressions of the same identical character which remains unchanged in 
its essence from one self-fertilised generation to another. If such were the case 


Correlation Table—Spotting—Parents and Offspring. Series II and III.
Offspring. Grades of Spotting.


Parents. Grades of Spotting: 0—3, 4—7, 8—11, 12—15, 16—19, 20—23, 24—27, 28—31, 32—35, 36—39, 40—43, 44—47; Totals.


Correlation Table—Spotting—Grandparents and Grandchildren. 
Series II and III.


Grandchildren. Grades of Spotting. 


Grand- Ror se ee | 
parents. S oN ®@ | G | & S 
Grades of %% IL L I | | tes 
Spotting St} Ol aln | ~Z_] Og 
5—11 8 | 19 | 35 | 19 5 | 2 101 
12—15 21 | 20} 81 | 10} 12 | — 124 
16—19 é } Z 22 | 25 | 37 | 15 2 ik 154 
20—23 2 9] 15} 13 3 1 | — 60 
24—27 - j (0) 4); —};—|—]|— 30 
28—31 2 5 : 6 8 1}—}—]}]— 40 
a EE ee SSS 
Totals 7 | 12 | 17 | 30 | 33 | 66 | 66 | 91 |117| 47 | 20 |’ 3 509 
the small variations would be fluctuating, non-inheritable variations; but the
results in the present case are definitely against a supposition of this kind. 

It might be urged by some that the result is really due to the existence of 
genotypes, and that variations within the limits of each genotype are not inherit- 
able. The distributions of the families in the table do not indicate the occurrence 
of genotypes of any considerable magnitude. If the genotypes are supposed to be 
very small the practical result would become indistinguishable from the inherit- 
ance of continuous variations. 


7. RATIO OF BREADTH TO LENGTH OF COROLLA.

The breadth was measured as the maximum horizontal width across the 
mouth of the corolla of a fully expanded flower in which the anthers had opened ; 
the length was the maximum distance measured along the mid-adcauline surface 
with the lower lip stretched out straight in the long axis of the flower. It is
convenient to express the ratio in the form (Breadth / Length) × 1000. The mean of the
ratios of the four lowest flowers of an axis was taken as the mean of the plant.
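The measure just defined, breadth divided by length and multiplied by 1000, with the plant value taken as the mean over the four lowest flowers of an axis, can be sketched as below. The function names and the measurements are hypothetical and serve only to show the arithmetic; they are not data from the paper.

```python
from statistics import mean


def corolla_ratio(breadth, length):
    """Breadth-to-length ratio of one flower, scaled by 1000."""
    if length <= 0:
        raise ValueError("length must be positive")
    return 1000 * breadth / length


def plant_ratio(flowers):
    """Mean ratio of the four lowest flowers of an axis.

    `flowers` is a list of (breadth, length) pairs ordered from the lowest
    flower upwards; only the first four entries are used.
    """
    if len(flowers) < 4:
        raise ValueError("need measurements for at least four flowers")
    return mean(corolla_ratio(b, l) for b, l in flowers[:4])


# Hypothetical measurements in millimetres (not taken from the paper).
example_axis = [(30, 50), (29, 50), (31, 50), (30, 48), (28, 47)]
example_ratio = plant_ratio(example_axis)
print(round(example_ratio))
```

Scaling by 1000 simply lets the ratio be written as a whole number, which is the form in which the parental and family values are quoted in this section.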


The original parent plants varied widely in this ratio, and the families raised 
by selfing tended to have the same ratio as their parents. 

A plant bearing wide flowers was crossed with one having narrow flowers, and 
the offspring tended to be intermediate. On selfing these offspring the new 
generation exhibited, of course, considerable variation, but taken as a whole the 
intermediate condition was retained, and there was clearly no segregation into 
wide flowers and narrow flowers. Thus, the different degrees of this character 
blend readily on crossing, and the mode of inheritance is very similar to that of 
the spotted condition. 

The results of a multitude of crossings of plants bearing variously shaped flowers 
have been carefully determined and tabulated, and there is no question about the 
general accuracy of the statement made above. In the present place we may 
confine our attention to the self-fertilised generations of Series II and III (p. 128). 


A plant (♀ C₁) with relatively wide flowers (ratio 608) was crossed with a
plant (♂ B₁) having relatively narrow flowers (ratio 487). The family (= II) had
flowers approximately intermediate. The reciprocal cross = III. The distributions 
of the families of the various generations raised by selfing are shown in the 
accompanying table. The families of each generation are given in an ascending
order of the ratios of the parents. As in the case of the character of spotting it 
will be seen that there is a clearly marked tendency for the mean ratios of the 
families to approximate to the ratios of the respective parents. In none of the 
families do we find any definite segregation into plants with wide flowers and 
plants with narrow flowers resembling those of the two progenitors of the series. 


Wide and narrow races could be raised by selection using only self-fertilisation. 


[Table. Series II and III. Ratio of Flower: Families from selfed plants.]

Thus in family III with a mean ratio of 531 there was a single plant (III 2)
with as high a ratio as 575. This was selfed and the mean ratio of the offspring
was 574. In this family there was a plant (III 2, 1) with a ratio of 533 and the
mean of offspring = 563. III 2, 1, 18 (ratio 551) produced a family with mean
561, and III 2, 1, 18, 28 (ratio 598) produced a family with a mean of 606.

In the reverse direction, through III 2, III 2, 5, III 2, 5, 10 and III 2, 5, 10, 22
we pass from a parent of ratio 575 to a family having a mean ratio of 477.

With the data given in the preceding table, correlation tables have been 
prepared for parents and offspring, and grandparents and grandchildren. 


Correlation Table—Ratios of Corolla—Parents and Offspring. 
Series II and III. 


Offspring. 
Parents. 2 | D AP al al ays ge |g | 
Grades of Sic SS || “esinllices ‘ Selle 
. B [ ll le | | | i} Totals 
Ratios L 1000 Slee & s Si = S 
so © © So Ne} © Is) 
4L0—439 1 O 1 4 5 4; — | — 15 
44O—469 | 1 1 8 8 6; 1 | — 25 
ATO—JII — \);— — | — APG Ae39 se soult ton lel 3 1 132 
500—529 — 2 3 8 | 15 | 28 | 45 | 40 | 28 6 2 1 178 
5380—559 — 1 6 | 15 | 29 | 47 | 38 | 32 7 3) —|— 178 
560—589 — 2 12 | 15 | 30:) 34) 18 | 13 Q);—-|—~—|— 126 
590—619 1 2 9; 18] 14 5;/—};—]—]— — 49 
620—659 Ns) | ee ee eee 0 
650—679 = 2, 3 2 4 O 1)/—j})—j;—] — |} — 10 
Totals 713 


Correlation Table—Ratios of Corolla—Grandparents and Grandchildren. 
Series II and III.


Grandchildren. 


Grades of RaSh) aac dn esa yr SSE Pesta PKS ote eresy dP tase yl esi |) hess TS 
8 | L | | | il ia | | ] | Totals 

Ratios L 1000 = 3 = & Se st 2 = & | $ s S 

SS [ci SSS) ic) Bea R i ices || eSmaliecoms ecco acons ecco acs) 

Ee ee a eS | ee ——— 
44O—4O9 —{— |) — |}; — 1 (0) 3 9 9 i — 29 
ATO—AII Ss ee | || 2 3 | 10/15 | 11 3 1 46 
500—529 ey | ey || 4 | 22 | 36 | 26 | 22) 12) 2 | — 123 
580—559 1 3 | 14 | 38 | 37 | 34 | 24; 16) 9 Pe |) al, |p 179 
560—589 —— |" 2 83) LO 138433) Po |e 2ie ie se || — > | = 118 
5II0—619 a fe re Pe ie | Vr a ee ey, aes. || (0) 
620—649 = | pee ae | cae, || mS li (6) 
650—679 — ieee 3 3 {l tae ree ee 10 
Totals 


The coefficients of correlation are .601 for parents and offspring and .492 for
grandparents and grandchildren. The latter figure is somewhat high; but taking
the results altogether they are incompatible with any notion of pure-lines.

8. GENERAL CONCLUSIONS.


In the various characters that have been dealt with in the crossing of different 
strains of the garden foxglove we have seen that in pelorism, colour of corolla and 
colour of spots, the mode of inheritance is Mendelian with reference to the 
qualities: peloric and non-peloric, purple and white corolla, purple spots and 
brown spots. If, however, there are any marked differences in the intensities
of these qualities, the mode of inheritance of the intensity of the quality was
found to be of the blended type.


The other characters examined were quantitative in nature, such as degree of 
the development of purple spots and the ratio of breadth to length of corolla, and 
these characters blended completely. 


When the intensity of a quality is very slight and approaching zero the 
difficulty arises as to which category the individual should be referred. When 
Mendelian inheritance is in evidence the critical point may apparently be determined 
by the occurrence of segregation. Thus, if a homozygous plant with a very faint
tinge of purple (say an intensity of about 4) is crossed with a homozygous strongly
coloured plant, segregation occurs in the so-called F₂ generation, and we obtain
on the average 1 faintly tinged plant to 3 much more darkly coloured plants.
When, however, the pale plant has a somewhat greater intensity (say about 10),
the F₁ and subsequent generations are intermediate, and definite segregation does
not occur. In accordance with this procedure a plant with flowers having an
intensity of general coloration which did not reach 5 of the scale was classed as 
“white.” Without employing such a line of demarcation the results obtained 
were wholly unintelligible. 


From the strict Mendelian standpoint, in the example given above, it would 
probably be affirmed that the faint tinge of purple on “white” flowers is not 
really a fractional part of the general purple coloration of coloured plants, but is 
a distinct character governed by a different factor or set of factors in the chromo- 
somes. To one who has grown the plants this view appears an artificial one. 
In my previous account I stated that there appeared to be a distinct gap among 
my plants between “white” plants and coloured plants, and that colorations of 
about 8—25 of the scale were extremely rare or almost absent, but I have sub- 
sequently obtained a number of plants having such intensities of coloration, 
passing imperceptibly down to absolute whiteness. Consequently it is quite
unlikely that the faint tinge of purple on “white” flowers is anything else than
the last remnant of a general purple coloration.


It is quite similar in the character of pelorism, but the difficulty in finding a 
suitable method of measuring this character renders the matter less obvious. 
Thus, it would appear that if a character is not present beyond a certain minimum 
or unit quantity it may be unable to blend on crossing with a plant possessing the 
character in a well-marked degree.

With reference to the characters which blend, the accompanying table summarizes
the results obtained for parental correlation. Mid-parents and self-fertilised
parents are regarded as comparable.


                                                                     Coefficient of
                                                      Number of      Correlation,
Character                                             Offspring      Parents and
                                                                     Offspring

Intensity of pelorism (homozygous recessive,
  mid-parents and self-fertilised parents)               530            .520
Intensity of general purple coloration (homozygous
  dominants, self-fertilised parents)                    529            .707
Seed-length (self-fertilised parents)                     46            .378
Spotting (self-fertilised parents)                       716            .560
Ratio of Corolla (self-fertilised parents)               713            .601


The probable errors of these results are reasonably small and the average
coefficient for the 5 characters is .553, which is not far removed from the average
coefficient found by Professor Karl Pearson for a large number of characters in a
variety of different organisms.
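
As a quick arithmetic check on the average just quoted, the sketch below works out which coefficient the second character must have for the stated average of .553 to hold, given the four clearly legible coefficients. The value .707 in the second step is an assumed reading of a damaged table entry, not a confirmed figure.

```python
# Legible parental coefficients (pelorism, seed-length, spotting, ratio of
# corolla); the coefficient for general purple coloration is treated as unknown.
legible = [0.520, 0.378, 0.560, 0.601]
stated_average = 0.553

# Coefficient the fifth character must have for the stated average to hold.
implied_fifth = 5 * stated_average - sum(legible)

# Assumed reading of the damaged entry (hypothetical; see the note above).
assumed_fifth = 0.707
average = (sum(legible) + assumed_fifth) / 5

print(round(implied_fifth, 3), round(average, 3))
```

The implied value and the assumed reading agree to within the rounding of the stated average.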


It must be again emphasized that these results are based on self-fertilised
generations of pedigree plants of known gametic constitution, and on Johannsen’s
theory of pure-lines these parental coefficients should be zero, or at least very
small.


The evidence of the present investigation is therefore definitely against any
general application of the theory of pure-lines and of genotypes of any appreciable
magnitude, and further it indicates that selective breeding within self-fertilised
generations of a homogeneous race is capable of modifying that race to a marked
degree.


EXPLANATION OF PLATE I. 


Figs. 1 and 2.—Pelorism of maximum intensity ; grade 100°. Corollas absent, sessile anthers. 

Figs. 3 and 4.—Perfect pelorism, grade 100°. Corollas joined along their split edges forming a complete 
saucer. Stamens with filaments. 

Fig. 5.—Peloric flower of side-axis; the axis terminates in an ovary. 

Fig. 6.—Pelorism of grade 100°. Numerous flowers fused irregularly forming a rosette, the axis has 
grown through the crown. 

Figs. 7 and 8.—Incomplete pelorism of main axes, grade 75°. A spiral bending often occurs. 

Fig. 9.—Faintly defined pelorism. When such occurred on the lateral axes the plant was said to 
possess a grade of 25°. Side view, and view from above. 

Fig. 10.—Flowering axis of a conspicuous sport in which practically all the corollas are completely split
longitudinally into four elongated blades. Nature of inheritance obscure.


The photographs were kindly taken by Dr Conrad Akerman. 


ON POLYCHORIC COEFFICIENTS OF CORRELATION. 
By KARL PEARSON, F.R.S. AND EGON S. PEARSON.


(1) ONE of the difficulties which are constantly recurring in statistical practice
is that of the correlation or contingency table in which the two variates are
classified in broad categories. We may indeed proceed by the method of mean
square contingency and correct for the grouping of both variates by the class
index corrections on the assumption that the marginal totals for both variates
may be assumed to follow approximately normal distributions. Such a procedure
gives reasonably satisfactory results*, provided the marginal totals are not in very
unequal groupings and the correlation is not intense (say, .85 and above). The
polychoric table has been discussed by Ritchie-Scott and he has described a method
of reaching a polychoric coefficient of correlation from the weighted mean of the
possible tetrachoric values†. Such a process is, however, so laborious that it can
hardly establish itself in practice. From the theoretical standpoint, however,
Ritchie-Scott’s paper was of great interest (i) as guiding us by the size of the
probable errors to discriminate between the valuable and worthless dichotomies in
tetrachoric determinations of the correlation‡, (ii) as providing standard values by
which those obtained by other procedures could be directly tested.

We shall endeavour to reach in this paper another form of polychoric co- 
efficient,—that is a correlation coefficient which does use all the information given 
in a polychoric table,—but which requires less analysis than Ritchie-Scott’s weighted 
mean coefficient. Thus what may be lost in exactness will possibly be repaid by 
practical efficiency. There is another point also of very considerable illustrative im- 
portance ; we desire wherever the data are suitable actually to exhibit in the form 
of a graph the relation between the two variates. This should be possible in the 
case of a polychoric table, and in the past has frequently been done by approximate 
methods of more or less validity. 


We can indeed take such methods as our present starting point as they will 
directly indicate to the reader our line of approach. 

We start with the hypothesis that the marginal totals of our polychoric table 
can be represented on a normal scale. This is no great assumption in itself. If a 
true quantitative scale ever becomes available it can be attached at once and with 
little trouble to the normal scale. To exhibit a variate on a normal scale makes
no greater assumption than when we exhibit a pressure-volume curve as a straight
line by using a logarithmic scale.

* By “reasonably satisfactory results,” we mean that in cases which can be directly checked by the
product moment method the difference is within the range of practical insignificance as judged by
probable error.

† Biometrika, Vol. XII. pp. 93–133.

‡ Thus in a 3×3 table it is possible for two of the corner dichotomies, i.e. those unassociated with
the diagonal in the sense of the correlation, to have even negative weights, so that they should be omitted
in finding the mean.

Now let the polychoric table be such that in the population $N$ under discussion,
the $s$th category of the first variate A contains $n_{s\cdot}$ individuals and the $s'$th category
of the second variate B contains $n_{\cdot s'}$ individuals, while the number of individuals
who combine in the population $N$ the $s$th category of A and the $s'$th category of B
is $n_{ss'}$.

Now when we proceed to exhibit the categories of the A-variate on a normal
scale, the process will give us two important quantities:

(a) We shall have the ratio of abscissa to standard deviation at the dichotomy
between each pair of broad categories.

If $n_{1\cdot}, n_{2\cdot}, n_{3\cdot}, \ldots n_{s\cdot}, \ldots$ be the frequencies of the A-variate for the several
categories the values of the ratios of abscissae to standard deviation will be specified as
$$-\infty,\; h_1,\; h_2,\; \ldots\; h_{s-1},\; h_s,\; \ldots\; h_{q-1},\; +\infty.$$
Here $h_{s-1}$, $h_s$ are the values on either side of the category $n_{s\cdot}$ and if there be
$q$ categories, $n_{1\cdot}$ is bounded by $h_0$ or $-\infty$ and $h_1$, while $n_{q\cdot}$ is bounded by $h_{q-1}$ and
$h_q$ or $+\infty$. The lower $h$'s will have negative and the upper positive signs and the
greatest care must be taken to see that the proper signs are given to the values
of $h$. Similarly if the frequencies of the various categories of the B-variate be
$$n_{\cdot 1},\; n_{\cdot 2},\; n_{\cdot 3},\; \ldots\; n_{\cdot s'},\; \ldots$$
the values of the ratios of ordinates to standard deviation will be represented by
$$-\infty,\; k_1,\; k_2,\; \ldots\; k_{s'-1},\; k_{s'},\; \ldots\; +\infty,$$
where $k_{s'-1}$ and $k_{s'}$ give the dichotomies on either side of $n_{\cdot s'}$.

We may consider the coordinate at the back of the variate A when represented
on a normal scale to be $x'$, the origin being taken at the mean on the normal scale.
Hence if the standard deviation be $\sigma_x$, we shall find it convenient to write the
absolute normal abscissae
$$x' = \sigma_x x, \qquad h' = \sigma_x h.$$
Similarly we take $y'$ for the coordinate at the back of the variate B, measured
from the mean, and write:
$$y' = \sigma_y y, \qquad k' = \sigma_y k,$$
where $\sigma_y$ is the standard deviation of B. Clearly until a quantitative scale has
been determined we shall know $h$, $k$, $x$, $y$ but not $h'$, $k'$, $x'$, $y'$, $\sigma_x$ and $\sigma_y$.

(b) We shall determine the ratio of abscissa to standard deviation, or the ratio
of ordinate to standard deviation of the centroids or means of the groups $n_{s\cdot}$
and $n_{\cdot s'}$.

Let
$$H_s = \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 h_s^2}, \qquad K_{s'} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 k_{s'}^2};$$
then the means of the categories $n_{s\cdot}$ and $n_{\cdot s'}$ are determined by
$$\bar h_{s\cdot} = (H_{s-1} - H_s)\Big/\frac{n_{s\cdot}}{N}, \qquad \bar k_{\cdot s'} = (K_{s'-1} - K_{s'})\Big/\frac{n_{\cdot s'}}{N} \qquad \text{(i)}$$
respectively. The numerical values of $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$ can be easily ascertained from the
table published recently of ordinates of normal curve to permilles of area*. Care
must be taken in every case to give the correct sign to $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$.
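
The reduction of a set of marginal totals to the normal scale can be sketched in a few lines. This is an illustrative sketch rather than the authors' procedure: it uses the Python standard library in place of printed tables, the function name `normal_scale` is hypothetical, and the per-mille frequencies are those implied by the cumulative areas of the fathers' statures in Table I.

```python
from statistics import NormalDist

def normal_scale(frequencies):
    """Return (dichotomies h_1..h_{q-1}, category means) on the normal scale."""
    nd = NormalDist()                      # unit normal: mean 0, s.d. 1
    total = sum(frequencies)
    # Ratio of abscissa to standard deviation at each dichotomy.
    h, running = [], 0
    for n in frequencies[:-1]:
        running += n
        h.append(nd.inv_cdf(running / total))
    # Ordinates H_0 .. H_q, with H = 0 at the two infinite end boundaries.
    H = [0.0] + [nd.pdf(v) for v in h] + [0.0]
    # Formula (i): the mean of category s is (H_{s-1} - H_s) / (n_s / N).
    means = [(H[s] - H[s + 1]) / (n / total) for s, n in enumerate(frequencies)]
    return h, means

# Marginal frequencies per mille of the A-variate (fathers' stature), taken
# from the differences of the cumulative areas in Table I.
father = [36, 322, 264, 180, 69, 101, 28]
h, means = normal_scale(father)
print([round(v, 5) for v in h])
print([round(v, 5) for v in means])
```

The signs come out automatically here, which is the point on which the text asks for care when working from tables of positive arguments only.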


Now if there were no correlation, $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$ combined would give the mean of
the group $n_{ss'}$, and they give a fair approximation to the result if there are numerous
categories, that is if the range of the categories be small.

The correlation found from these marginal centroids would then be
$$r_h = S\,(n_{ss'}\, \bar h_{s\cdot}\, \bar k_{\cdot s'})/N \qquad \text{(ii)},$$
but as Ritchie-Scott has shown† this $r_h$ diverges much more than $r_\phi$, the mean
square contingency value, from the true correlation, and considerably more than
the tetrachoric or polychoric coefficients do. The reason for this is clear and was
pointed out by one of us in 1913‡. Namely $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$ do not give the coordinates
of the mean of $n_{ss'}$. In fact $n_{ss'}\, \bar h_{s\cdot}\, \bar k_{\cdot s'}$ is not the contribution of $n_{ss'}$ to the
product-moment.


We propose in the present paper to give first the actual contributions of $n_{ss'}$ to
the means and product-moments of the two variates and then to apply these results
in order to obtain (a) a polychoric coefficient, and (b) a graph of the relation of the
two variates.

The essential assumptions that will be made are the following:

(i) The marginal totals having been reduced to a normal scale, and the correlation
being supposed to be $r$, we shall calculate what the contents of the $s$th-$s'$th
cell would be on the assumption that the frequency surface is the normal surface
represented by the given correlation and the marginal totals reduced to normal
scales. We shall further calculate the $x$-moment, the $y$-moment and the $xy$
product-moment of the $s$th-$s'$th cell on the same hypothesis.

(ii) From these data we shall determine the most suitable value to give to $r$,
so that the actually observed frequencies differ least from those that would be given
by such a correlation surface. We shall also obtain a formula for calculating the
mean value of $y$ for the array of B-variates, $n_{s\cdot}$ in number, which corresponds to
the $s$th category of A. We shall thus be in a position to plot the regression line of
B on A and test at the same time the closeness with which it fits the thus calculated
array means, both variates being represented on a normal scale.


We shall write the real coefficient of correlation of the population $r$, the
coefficient as found from a single $s$th-$s'$th cell, as $r_{ss'}$, and those found from the $n_{s\cdot}$
and $n_{\cdot s'}$ arrays as $r_{s\cdot}$ and $r_{\cdot s'}$ respectively.

$\bar h_{ss'}$, $\bar k_{ss'}$ will be the A- and B-variate means of the $s$th-$s'$th cell and $\pi_{ss'}$ the
product-moment, per unit of the population, of the frequency in the $s$th-$s'$th cell
about the mean axes as determined from the marginal totals on the normal scale.

* See Biometrika, Vol. XII. pp. 426–8.
† Biometrika, Vol. XII. p. 122.
‡ Biometrika, Vol. IX. p. 188.

(2) The developments we require involve the use of the tetrachoric functions.
The tetrachoric function of the order $t$ is given by*
$$\tau_t = \frac{1}{\sqrt{t!}} \left(-\frac{d}{dx}\right)^{t-1} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2} \qquad \text{(iii)}.$$
The tetrachoric functions $\tau_1$ to $\tau_6$ are tabled for positive values of $x$ in Tables
for Statisticians and Biometricians† to five decimal places. For negative values of $x$
tetrachoric functions of an odd order remain unchanged, but those of an even order
must have their sign as given in the tables reversed.

It will frequently be needful to take the difference of the tetrachoric functions
at the boundaries of a marginal category. Thus if $\tau_t(h)$ denotes the value of the
tetrachoric function for $x = h$, we shall need for the $s$th marginal total
$$\tau_t(h_s) - \tau_t(h_{s-1}).$$
This difference we shall write, for brevity,
$$\delta_s \tau_t,$$
and in obtaining its numerical value from tables of the tetrachoric functions it is
essential to remember that $s$ (or $s'$) is supposed to increase in the positive direction
of the axis of $x$ (or $y$), and that when $h$ (or $k$) is negative attention must be paid
to changing the sign of the tabled value of $\tau_t$, if $t$ be even.


The formula for determining the successive tetrachoric functions for a given
value of $x$ is
$$\tau_t = p_t\, x\, \tau_{t-1} - q_t\, \tau_{t-2} \qquad \text{(iv)},$$
where $p_t$ and $q_t$ are given by the following table:

  t      p_t         q_t          t      p_t         q_t

  2    .7071068    .0000000      14    .2672612    .8894990
  3    .5773503    .4082483      15    .2581989    .8970851
  4    .5000000    .5773503      16    .2500000    .9036962
  5    .4472136    .6708204      17    .2425356    .9095085+
  6    .4082483    .7302968      18    .2357023    .9146592
  7    .3779645    .7715168      19    .2294157    .9192547
  8    .3535534    .8017838      20    .2236068    .9233804
  9    .3333333    .8249578      21    .2182179    .9271051
 10    .3162278    .8432740      22    .2132007    .9304842
 11    .3015113    .8581163      23    .2085144    .9335637
 12    .2886751    .8703880      24    .2041241    .9363819
 13    .2773501    .8807047      25    .2000000    .9389709
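
The recurrence (iv) is easily mechanised. The following is a minimal sketch, assuming the closed forms $p_t = 1/\sqrt{t}$ and $q_t = (t-2)/\sqrt{t(t-1)}$ that the paper states later alongside (vi) bis; the function names are illustrative only. Because the argument carries its own sign, no separate sign rule for even orders is needed.

```python
import math

def p_coeff(t):
    """p_t = 1 / sqrt(t)."""
    return 1.0 / math.sqrt(t)

def q_coeff(t):
    """q_t = (t - 2) / sqrt(t (t - 1))."""
    return (t - 2) / math.sqrt(t * (t - 1))

def tetrachoric(x, t_max):
    """Return a list tau with tau[t] = tau_t(x) for t = 1..t_max (t_max >= 1).

    tau[0] is left at 0.0 and is not used: it only ever multiplies q_2 = 0.
    """
    tau = [0.0] * (t_max + 1)
    tau[1] = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    for t in range(2, t_max + 1):
        tau[t] = p_coeff(t) * x * tau[t - 1] - q_coeff(t) * tau[t - 2]
    return tau

# First dichotomy of the fathers' statures in Table I.
tau = tetrachoric(-1.79912, 6)
print([round(v, 5) for v in tau[1:]])
```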


Since $\tau_1 = \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}$, it can be found at once from the tables for the ordinates
of the normal curve, and will indeed have been computed at each division in order
to determine $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$. It is then often simpler to work directly with (iv) rather
than interpolate into the tabled values of the functions.

* The reasons why the tetrachoric functions are tabled with the factor $1/\sqrt{t!}$ are: (a) because this
factor greatly simplifies our formulae and (b) because a factor of some such order is essential, if we are
to have manageable tabulated values. As a matter of fact the factor chosen reduces all tetrachoric
functions to numerical values lying between 0 and 1.

† Cambridge University Press, pp. 42–51.
In an earlier paper* dealing with the tetrachoric functions one of us has shown
that if
$$z = \frac{N}{2\pi\sqrt{1-r^2}}\; e^{-\frac12\,\frac{x^2 - 2rxy + y^2}{1-r^2}}$$
be the equation to a normal correlation surface the variates being measured in the
standard deviations as units, then
$$z/N = \tau_1\tau_1' + 2r\,\tau_2\tau_2' + 3r^2\,\tau_3\tau_3' + \ldots + t\,r^{t-1}\,\tau_t\tau_t' + \ldots \qquad \text{(v)},$$
where $\tau_t = \tau_t(x)$ and $\tau_t' = \tau_t(y)$.
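
The expansion (v) can be checked numerically against the closed form of the surface. This is a sketch under the assumption that the tetrachoric functions are generated by the recurrence (iv); the function names and the test point are arbitrary choices for illustration.

```python
import math

def tetrachoric(x, t_max):
    """tau[t] = tau_t(x) for t = 1..t_max, by recurrence (iv); tau[0] unused."""
    tau = [0.0] * (t_max + 1)
    tau[1] = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    for t in range(2, t_max + 1):
        tau[t] = (x * tau[t - 1] / math.sqrt(t)
                  - (t - 2) / math.sqrt(t * (t - 1)) * tau[t - 2])
    return tau

def density_series(x, y, r, t_max=60):
    """z/N from the series (v), truncated after t_max terms."""
    tx, ty = tetrachoric(x, t_max), tetrachoric(y, t_max)
    return sum(t * r ** (t - 1) * tx[t] * ty[t] for t in range(1, t_max + 1))

def density_closed(x, y, r):
    """z/N from the closed form of the normal correlation surface."""
    q = (x * x - 2.0 * r * x * y + y * y) / (1.0 - r * r)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))

series = density_series(0.3, -0.7, 0.5)
closed = density_closed(0.3, -0.7, 0.5)
print(series, closed)
```

The terms fall off roughly as $r^{t-1}$, so many more terms are needed as $r$ approaches unity; this is the numerical face of the warning given later about correlations sensibly above .50.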


Now in order to proceed further it is needful to determine the following
integrals:
$$\int_{h_{s-1}}^{h_s} \tau_t\, dx, \qquad \int_{h_{s-1}}^{h_s} x\,\tau_t\, dx.$$
We can determine these by using (iii) after in the second case integrating by
parts. We have:
$$\int_{h_{s-1}}^{h_s} \tau_t\, dx = \frac{1}{\sqrt{t!}} \int_{h_{s-1}}^{h_s} \left(-\frac{d}{dx}\right)^{t-1} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\, dx
= -\frac{1}{\sqrt{t!}} \left[\left(-\frac{d}{dx}\right)^{t-2} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\right]_{h_{s-1}}^{h_s}
= -\frac{1}{\sqrt{t}}\, \delta_s \tau_{t-1} \qquad \text{(vi)}.$$
Again:
$$\int_{h_{s-1}}^{h_s} x\,\tau_t\, dx = \frac{1}{\sqrt{t!}} \int_{h_{s-1}}^{h_s} x \left(-\frac{d}{dx}\right)^{t-1} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\, dx
= -\frac{1}{\sqrt{t}} \Big[x\,\tau_{t-1}\Big]_{h_{s-1}}^{h_s} + \frac{1}{\sqrt{t}} \int_{h_{s-1}}^{h_s} \tau_{t-1}\, dx$$
$$= -\frac{1}{\sqrt{t}} \Big[x\,\tau_{t-1}\Big]_{h_{s-1}}^{h_s} - \frac{1}{\sqrt{t}}\,\frac{1}{\sqrt{t-1}} \Big[\tau_{t-2}\Big]_{h_{s-1}}^{h_s}
= -\frac{1}{\sqrt{t}}\, \delta_s (x\,\tau_{t-1}) - \frac{1}{\sqrt{t}\,\sqrt{t-1}}\, \delta_s \tau_{t-2} \qquad \text{(vi) bis}.$$
But by (iv):
$$x\,\tau_{t-1} = \frac{\tau_t + q_t\,\tau_{t-2}}{p_t},$$
where $p_t = 1/\sqrt{t}$, $q_t = (t-2)/\sqrt{t(t-1)}$.

* Phil. Trans. Vol. 195 A, p. 4, Equation (xiv), with a slight change of notation. In that paper,
$\frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\, \frac{v_n}{\sqrt{(n+1)!}}$ is written for $\tau_{n+1}$, and $\frac{1}{\sqrt{2\pi}}\, e^{-\frac12 y^2}\, \frac{w_n}{\sqrt{(n+1)!}}$ for $\tau'_{n+1}$.

Thus:
$$x\,\tau_{t-1} = \sqrt{t}\;\tau_t + \frac{t-2}{\sqrt{t-1}}\;\tau_{t-2}.$$
Accordingly
$$\int_{h_{s-1}}^{h_s} x\,\tau_t\, dx = -\frac{1}{\sqrt{t}} \left(\sqrt{t}\;\delta_s\tau_t + \sqrt{t-1}\;\delta_s\tau_{t-2}\right) \qquad \text{(vii)}.$$
The latter form throws us back on $\delta_s\tau_t$ which will have to be calculated to
determine the integral in (vi) for the successive values of $t$ and $s$.
On the other hand a table of
$$T_t = \sqrt{t}\;\tau_t + \sqrt{t-1}\;\tau_{t-2} \qquad \text{(viii)}$$
would be a convenient method of determining the integral and tables of $T$ might
be easily formed, say up to $T_6$.

In this case we may write (vii):
$$\int_{h_{s-1}}^{h_s} x\,\tau_t\, dx = -\frac{1}{\sqrt{t}}\; \delta_s T_t \qquad \text{(ix)}.$$


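We are now in the position to compute all the requisite integrals we need; if
we write $\bar n_{ss'}$ for the contents of the $s$th-$s'$th cell, then on the supposition that the
surface is normal, has correlation $r$ and follows the actual marginal frequencies, we
have:
$$\frac{\bar n_{ss'}}{N} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}} \frac{z}{N}\, dx\, dy
= \delta_s\tau_0\,\delta_{s'}\tau_0' + r\,\delta_s\tau_1\,\delta_{s'}\tau_1' + r^2\,\delta_s\tau_2\,\delta_{s'}\tau_2' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots \qquad \text{(x)},$$
$$\frac{\bar n_{ss'}}{N}\,\bar h_{ss'} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}} \frac{x z}{N}\, dx\, dy
= \delta_s T_1\,\delta_{s'}\tau_0' + r\,\delta_s T_2\,\delta_{s'}\tau_1' + r^2\,\delta_s T_3\,\delta_{s'}\tau_2' + \ldots + r^p\,\delta_s T_{p+1}\,\delta_{s'}\tau_p' + \ldots \qquad \text{(xi)},$$
$$\frac{\bar n_{ss'}}{N}\,\bar k_{ss'} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}} \frac{y z}{N}\, dx\, dy
= \delta_s\tau_0\,\delta_{s'}T_1' + r\,\delta_s\tau_1\,\delta_{s'}T_2' + r^2\,\delta_s\tau_2\,\delta_{s'}T_3' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}T_{p+1}' + \ldots \qquad \text{(xii)},$$
$$\frac{\bar n_{ss'}}{N}\,\pi_{ss'} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}} \frac{x y z}{N}\, dx\, dy
= \delta_s T_1\,\delta_{s'}T_1' + r\,\delta_s T_2\,\delta_{s'}T_2' + r^2\,\delta_s T_3\,\delta_{s'}T_3' + \ldots + r^{p-1}\,\delta_s T_p\,\delta_{s'}T_p' + \ldots \qquad \text{(xiii)}.$$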
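
Formula (x) for the theoretical cell contents can be sketched as follows. The sketch assumes the sign convention of Table I, in which $\tau_0$ is the negative of the area below the dichotomy, together with the recurrence (iv); the function names are illustrative. As a check it is applied to a fourfold table divided at the means, where the content of a quadrant is known to be $\frac14 + \frac{1}{2\pi}\sin^{-1} r$.

```python
import math
from statistics import NormalDist

def tau_values(x, p):
    """[tau_0(x), ..., tau_p(x)], with tau_0 = -(area below x) as in Table I."""
    if math.isinf(x):
        return [0.0 if x < 0 else -1.0] + [0.0] * p
    tau = [-NormalDist().cdf(x),
           math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)]
    for t in range(2, p + 1):
        tau.append(x * tau[t - 1] / math.sqrt(t)
                   - (t - 2) / math.sqrt(t * (t - 1)) * tau[t - 2])
    return tau[:p + 1]

def differences(bounds, p):
    """One list [delta tau_0, ..., delta tau_p] for each category."""
    vals = [tau_values(b, p) for b in bounds]
    return [[upper[t] - lower[t] for t in range(p + 1)]
            for lower, upper in zip(vals, vals[1:])]

def cell_fraction(dx, dy, r):
    """Formula (x): theoretical cell contents as a fraction of N."""
    return sum(r ** t * a * b for t, (a, b) in enumerate(zip(dx, dy)))

# Fourfold table divided at the means of both variates, r = 0.5.
halves = differences([-math.inf, 0.0, math.inf], 60)
quadrant = cell_fraction(halves[0], halves[0], 0.5)
print(quadrant, 0.25 + math.asin(0.5) / (2.0 * math.pi))
```

Summed over all cells the series reduces to its first term, so the theoretical contents always add up to $N$ whatever the number of terms retained.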


It is desirable to say a few words about the functions $\tau_0$ and $T_1$ which may at
first present difficulties to the reader. $-\delta_s\tau_0$ clearly stands for the integral
$$\int_{h_{s-1}}^{h_s} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\, dx = \int_{h_{s-1}}^{h_s} \tau_1\, dx,$$
and is therefore simply $n_{s\cdot}/N$.

Similarly $-\delta_{s'}\tau_0' = n_{\cdot s'}/N$.

Next clearly $-\delta_s T_1$ stands for
$$\int_{h_{s-1}}^{h_s} x\,\tau_1\, dx = \int_{h_{s-1}}^{h_s} \frac{1}{\sqrt{2\pi}}\, x\, e^{-\frac12 x^2}\, dx
= -\left[\frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\right]_{h_{s-1}}^{h_s} = -\delta_s\tau_1,$$
or $\delta_s T_1 = \delta_s\tau_1$,
which is precisely the value given by (viii).

Thus (vii) is shown to be correct even for this special case although a form
like (vi) bis through which it is reached shows difficulties.

Similarly $\delta_{s'}T_1' = \delta_{s'}\tau_1'$.*

The remainder of the $\tau$'s knowing $\tau_0$ and $\tau_1$ come directly from (iv) and the $T$'s
are always given by (viii).

Now it is clear that (x) to (xiii) provide a large number of ways of determining
$r$. We might find $r$, i.e. $r_{ss'}$, from the single cell by writing in (x) $n_{ss'}$ for $\bar n_{ss'}$.
Or we may take
$$\bar h_{s\cdot} = \frac{1}{n_{s\cdot}}\, S_{s'}\,(n_{ss'}\,\bar h_{ss'})
= \frac{N}{n_{s\cdot}}\, S_{s'} \left\{\frac{n_{ss'}}{\bar n_{ss'}} \left(\delta_s T_1\,\delta_{s'}\tau_0' + r\,\delta_s T_2\,\delta_{s'}\tau_1' + \ldots + r^p\,\delta_s T_{p+1}\,\delta_{s'}\tau_p' + \ldots\right)\right\} \qquad \text{(xiv)},$$
where $\bar n_{ss'}$ is given by (x). But $\bar h_{s\cdot}$ is the known centroid of the $n_{s\cdot}$ marginal total,
and accordingly the above is an equation to find $r$, i.e. $r_{s\cdot}$, from a given column of the
table.

If we use this value of $r_{s\cdot}$ in (x) and (xii) to find $\bar n_{ss'}$ and $\bar k_{ss'}$, we obtain the
theoretical cell frequency and $y$-mean of the cell as found from a column.

Now sum $n_{ss'}\,\bar k_{ss'}$ for every value of $s'$ and we find $\bar k_{s\cdot}$ the $y$ mean of a column
depending on the data as found from the column, i.e.
$$\bar k_{s\cdot} = \frac{1}{n_{s\cdot}}\, S_{s'}\,(n_{ss'}\,\bar k_{ss'})
= \frac{N}{n_{s\cdot}}\, S_{s'} \left\{\frac{n_{ss'}}{\bar n_{ss'}} \left(\delta_s\tau_0\,\delta_{s'}T_1' + r\,\delta_s\tau_1\,\delta_{s'}T_2' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}T_{p+1}' + \ldots\right)\right\} \qquad \text{(xv)},$$
where $n_{ss'}$ is the observed cell frequency and $\bar n_{ss'}$ the frequency found by (x) when
we insert the value of $r$ as found from (xiv). We are thus in a position theoretically
to determine on a normal scale the mean of a column from the correlation actually
determined from that column. This would be the ideal method of determining the
mean of a row or column; but it would involve a great deal of hard work, as with
the two regression curves we should need to find $r$ for every row and column by an
equation of a high order. Hence in most cases we are likely to content ourselves
by finding $r$ for the whole table and then use this value in (x) to determine $\bar n_{ss'}$
and in (xv) to find the mean of the array. $\bar k_{s\cdot}$ plotted to the known $\bar h_{s\cdot}$ on the
normal scale will give the regression curve.

* We can thus take $T_1 = \tau_1$ and $T_1' = \tau_1'$.

The question now arises as to the manner in which we can find $r$ for the whole
table most effectively.


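Clearly we might assume the product-moment components from (xiii) and sum
for all cells. We should have
$$S_{s,s'} \left(\frac{n_{ss'}\,\pi_{ss'}}{N}\right) = r,$$
since the coordinates are measured from the means in terms of the standard
deviations as units.

Hence substituting from (xiii) we have:
$$r = S_{s,s'} \left\{\frac{n_{ss'}}{\bar n_{ss'}} \left(\delta_s T_1\,\delta_{s'}T_1' + r\,\delta_s T_2\,\delta_{s'}T_2' + \ldots + r^{p-1}\,\delta_s T_p\,\delta_{s'}T_p' + \ldots\right)\right\} \qquad \text{(xvi)}.$$
Here $\bar n_{ss'}$ must be substituted from (x) and we have finally
$$r = S_{s,s'} \left\{\frac{n_{ss'}}{N}\;
\frac{\delta_s T_1\,\delta_{s'}T_1' + r\,\delta_s T_2\,\delta_{s'}T_2' + \ldots + r^{p-1}\,\delta_s T_p\,\delta_{s'}T_p' + \ldots}
{\delta_s\tau_0\,\delta_{s'}\tau_0' + r\,\delta_s\tau_1\,\delta_{s'}\tau_1' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots}\right\} \qquad \text{(xvi) bis}.$$
This equation based upon the product-moment method of finding $r$ is clearly
likely to be very complicated, and although it can be proved that the product-moment
method is the “best” method of finding $r$ when we are dealing with
a series of quantitatively measured individuals, we have no certainty that it is the
best method in the present case of broad categories. It may indeed be questioned
whether another method now to be considered cannot be shown to be better or
at least equally efficacious.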


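Let us consider for a moment what we have in view. We observe $n_{ss'}$ as the
frequency of the $s$th-$s'$th cell; we find that with a given correlation $r$ the frequency
of this cell would be $\bar n_{ss'}$ on the assumption that the frequency surface is the normal
frequency surface corresponding to the observed marginal totals. Accordingly the
most probable value to give to $r$ would be that which made
$$\chi^2 = S_{s,s'}\, \frac{(n_{ss'} - \bar n_{ss'})^2}{\bar n_{ss'}} = \text{minimum},$$
or, what is the same thing,
$$S_{s,s'} \left(\frac{n_{ss'}^2}{\bar n_{ss'}}\right) = \text{minimum}.$$
This leads us, differentiating with regard to $r$, to
$$S_{s,s'} \left\{\left(\frac{n_{ss'}}{\bar n_{ss'}}\right)^2 \frac{d\bar n_{ss'}}{dr}\right\} = 0,$$
or, writing at length, our equation for $r$ is:
$$S_{s,s'} \left\{\left(\frac{n_{ss'}}{N}\right)^2
\frac{\delta_s\tau_1\,\delta_{s'}\tau_1' + 2r\,\delta_s\tau_2\,\delta_{s'}\tau_2' + \ldots + p\,r^{p-1}\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots}
{\left(\delta_s\tau_0\,\delta_{s'}\tau_0' + r\,\delta_s\tau_1\,\delta_{s'}\tau_1' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots\right)^2}\right\} = 0 \qquad \text{(xvii)}.$$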
Neither (xvi) nor (xvii) is very readily solved. Probably the easiest way will
be to obtain an approximate value of $r$ by existing methods either from a good
fourfold table, or from contingency, and then evaluate (xvi) or (xvii) for values of
$r$, one well above and one well below this result, so that the real value of $r$ lies
between the two. A linear interpolation will probably suffice in most cases to
determine $r$ with sufficient accuracy.
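
The search for $r$ can be illustrated with a small sketch. Several things here are assumptions made for illustration and are not the authors' computation: the "observed" proportions are hypothetical, being generated as an exactly normal table with $r = .50$; the criterion $S(n_{ss'}^2/\bar n_{ss'})$ is scanned over a grid of trial values instead of solving (xvii) by linear interpolation between two trial values; and thirty-nine further terms of the series are kept beyond the first, far more than would be used by hand. The dichotomies are those of Tables I and II.

```python
import math
from statistics import NormalDist

INF = math.inf
# Dichotomies of the two variates as given in Tables I and II.
H_BOUNDS = [-INF, -1.79912, -0.36381, 0.31074, 0.84879, 1.13113, 1.91104, INF]
K_BOUNDS = [-INF, -1.82501, -0.42615, 0.30286, 0.69349, 1.08482, 1.73920, INF]

def tau_values(x, p):
    """[tau_0(x), ..., tau_p(x)], with tau_0 = -(area below x) as in Table I."""
    if math.isinf(x):
        return [0.0 if x < 0 else -1.0] + [0.0] * p
    tau = [-NormalDist().cdf(x),
           math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)]
    for t in range(2, p + 1):
        tau.append(x * tau[t - 1] / math.sqrt(t)
                   - (t - 2) / math.sqrt(t * (t - 1)) * tau[t - 2])
    return tau[:p + 1]

def differences(bounds, p):
    vals = [tau_values(b, p) for b in bounds]
    return [[upper[t] - lower[t] for t in range(p + 1)]
            for lower, upper in zip(vals, vals[1:])]

def expected_table(r, dxs, dys):
    """Theoretical cell contents (as fractions of N) from formula (x)."""
    return [[sum(r ** t * a * b for t, (a, b) in enumerate(zip(dx, dy)))
             for dy in dys] for dx in dxs]

def criterion(observed, expected):
    """S(n^2 / n-bar), which is least where chi-squared is least."""
    return sum(o * o / e
               for row_o, row_e in zip(observed, expected)
               for o, e in zip(row_o, row_e))

P = 40  # highest order of tetrachoric function retained
dxs, dys = differences(H_BOUNDS, P), differences(K_BOUNDS, P)

# Hypothetical "observed" proportions: an exactly normal table with r = 0.50.
observed = expected_table(0.50, dxs, dys)

candidates = [i / 100 for i in range(40, 61)]
r_hat = min(candidates,
            key=lambda r: criterion(observed, expected_table(r, dxs, dys)))
print(r_hat)
```

With real data the minimum will not fall exactly on a trial value, and the bracketing and interpolation described in the text would be applied between the two best trial values.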

It will be observed that what we are trying to do is to fit a normal correlation
surface to a series of cell frequencies. We may do this by equating product-moments,
or actual cell frequencies properly weighted. The factors $\frac{n_{ss'}}{\bar n_{ss'}}$ and $\left(\frac{n_{ss'}}{\bar n_{ss'}}\right)^2$
come into our equations as a form of weights. When $n_{ss'}$ is small as compared with
$\bar n_{ss'}$ that cell will contribute less to the general equations for $r$, and when $n_{ss'}$ is
large as compared with $\bar n_{ss'}$ the contribution will be considerable. If the observed
results were closely normal then $n_{ss'}$ would be nearly $\bar n_{ss'}$. If we might assume the
differences of $n_{ss'}$ and $\bar n_{ss'}$ so small as to be negligible we should have:
$$r = S_{s,s'} \left(\delta_s T_1\,\delta_{s'}T_1' + r\,\delta_s T_2\,\delta_{s'}T_2' + \ldots + r^{p-1}\,\delta_s T_p\,\delta_{s'}T_p' + \ldots\right) \qquad \text{(xvi) ter},$$
and
$$0 = S_{s,s'} \left(\delta_s\tau_1\,\delta_{s'}\tau_1' + 2r\,\delta_s\tau_2\,\delta_{s'}\tau_2' + \ldots + p\,r^{p-1}\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots\right) \qquad \text{(xvii) bis},$$
instead of (xvi) bis and (xvii). These equations it will be found are identically
satisfied. Hence our values for $r$ from (xvi) and (xvii) depend on $\bar n_{ss'}$ differing
from $n_{ss'}$.


(3) We now proceed to illustrate the application of these results. 


Stature of Father and Son. 

The following table gives a correlation table for the inheritance of stature in 
Father and Son made up in broad categories corresponding to eye-colour groups*. 
Upon this material we shall be able to test our correlations and our graph against 
those found by definite numerical groupings. 

Stature of Father (Broad Categories). 


5 Totals | 
re 
S 
a if 1 34 
Sp Oss 8 301 
B-z 3 87 75 66 22 284 
QS y zoo aesocl mar. | le 137 
ey — 

o.8 5f — 18 27 26 11 105 
2s 6' — i 98 
8 o — é 6 4] 
és 
Totals 
i 


The positive direction of $x$ is from left to right and of $y$ vertically downwards.
It will suffice to take the $T$'s to five decimal figures but it will be needful to go
further with the $\tau$'s if the $T$'s are to be taken correctly to five figures from (viii).
The general reduction formula for the $T$'s is:
$$T_{t+1}(x) = \frac{T_t(x)\,(t-2)\,\sqrt{t-1}\;x^3 \;-\; T_{t-1}(x)\,(t-3)\,\big((t-1)\,x^2 + 1\big)}{\sqrt{t(t-1)}\;\big(x^2(t-2) + 1\big)} \qquad \text{(xviii)},$$
or,
$$T_{t+1}(x) = \frac{t-2}{\sqrt{t(t-1)}\;\big(x^2(t-2)+1\big)} \left\{\sqrt{t-1}\;x^3\,T_t(x) - \big(x^2(t-1)+1\big)\,\frac{t-3}{t-2}\;T_{t-1}(x)\right\} \qquad \text{(xviii) bis}.$$
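
The reduction formula can be checked against the definition (viii). The sketch below assumes formula (xviii) in the form $T_{t+1} = \{(t-2)\sqrt{t-1}\,x^3\,T_t - (t-3)((t-1)x^2+1)\,T_{t-1}\}/\{\sqrt{t(t-1)}\,((t-2)x^2+1)\}$, starts from $T_1 = \tau_1$ and $T_2 = x\tau_1 + \tau_0$ with $\tau_0$ taken as the negative of the area below $x$ (the convention of Table I), and uses illustrative function names.

```python
import math
from statistics import NormalDist

def tau_values(x, p):
    """[tau_0(x), ..., tau_p(x)] for finite x and p >= 1; tau_0 = -(area below x)."""
    tau = [-NormalDist().cdf(x),
           math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)]
    for t in range(2, p + 1):
        tau.append(x * tau[t - 1] / math.sqrt(t)
                   - (t - 2) / math.sqrt(t * (t - 1)) * tau[t - 2])
    return tau

def T_from_tau(x, p):
    """T_1..T_p from (viii); the list is indexed by order, index 0 unused."""
    tau = tau_values(x, p)
    T = [0.0, tau[1]]
    for t in range(2, p + 1):
        T.append(math.sqrt(t) * tau[t] + math.sqrt(t - 1) * tau[t - 2])
    return T

def T_by_reduction(x, p):
    """T_1..T_p from the reduction formula, without the higher tau's."""
    ordinate = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    # T_1 = tau_1 and T_2 = x tau_1 + tau_0 start the chain.
    T = [0.0, ordinate, x * ordinate - NormalDist().cdf(x)]
    for t in range(2, p):
        num = ((t - 2) * math.sqrt(t - 1) * x ** 3 * T[t]
               - (t - 3) * ((t - 1) * x * x + 1.0) * T[t - 1])
        den = math.sqrt(t * (t - 1)) * ((t - 2) * x * x + 1.0)
        T.append(num / den)
    return T

direct = T_from_tau(-1.79912, 7)
reduced = T_by_reduction(-1.79912, 7)
print([round(v, 5) for v in direct[1:]])
print([round(v, 5) for v in reduced[1:]])
```

The two routes agree, and the values match the $T$ entries of Table I at the first dichotomy of the fathers' statures.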


* See Biometrika, Vol. IX. p. 220.

Hence if $T_1$ and $T_2$ be found accurately the remaining $T$'s can be determined as
accurately as we please without reference to the $\tau$'s.

But,
$$T_1 = \tau_1, \qquad T_2 = x\,\tau_1 + \tau_0 \qquad \text{(xix)}.$$
Hence the tables of ordinates and areas of the normal curve readily provide $T_1$
and $T_2$ to seven decimal places, and (xviii) provides the higher $T$'s. These were cut
down to five figures and an approximate check on their values obtained by (viii).

As a matter of fact if $r$ is of the order .50 we cannot hope to obtain more than
three figure accuracy in $r$ without going to higher $\tau$- and $T$-functions than the
sixth, especially when using (xvi). But three figures in the correlation are usually
adequate and the labour of computing is much increased if higher functions are
used. Such must, however, be used if the correlation be sensibly higher than .50.

The following table gives the } (1+ .)’s, h’s, H’s, @’s, 7’s, S7’s, 7’s, and Y7’s for 
the a-variate. 


TABLE I. 
(cua Bore 7 7 
$(1+a) 0 036 °358 | *622 °802 871 972 | 1-000 
h —o |—1°'79912 | —°36381 | +°31074 | +°84879 |+1°13113 |+1:91104 +a 
H=7, 0) ‘07908 37340 *38014 *27827 21042 06425 10) 
z,= na! — 2s | 219667 — 91404 | —-02553 | +°56594 | + 98333 | +1:44723 | +2-20464 
Q\G3— Ss—1 \ 
TT) 0) —°036 —°358 — 622 — 802 — 871 -- 972 —1 
m=T) 0) +:'O07908 | +°'37340 | +°38014 | +°27827 | +°21042 | + 06425 0) 
T2 0) — ‘10060 | — 09606-} + °08353 | +°16701 | +°16830 | + °08682 0) 
T3 0) + 07221 | —°13226 | —:14021 | — ‘03176 | +°02401 | +°06952 0) 
TA 0) — 00688 | +:°07952 | —:07001 | —:10990 | — -08359 | + 01634 0) 
TS 0) — 04291 | +°07579 | + °08432 | — 02041 | —-05839 | — 03270 0 
T6 0) + 03654 | — 06933 | +°06182 | + 07319 | +:03408 | — -03744 0) 
Ir — 036 — 322 — '264 — ‘180 — 069 — 101 — ‘028 
Ir, +:°07908 | +°29432 | +°00674 | —:10187 | —:06785 | —°14617 | —-06425 
Ire ~ 10060 | +°00454 | +°17959 | +:°08348 | +°00129 | — 08148 | — 08682 
Ir; +:°07221 | —:20447 | — 00795 | +:°10845 | +°05577 | +°04555 | — 06956 
Sry — ‘00688 | + °08640 | — 14953 | — 03989 | +°02631 | +°09993 | — :01634 
Ir; — 04291 | +°11870 | +°00853 | —-+10473 | — 03798 | +°02569 | +.:032'70 
It +:°03654 | —:10587 | +°13115 | +:°011387 | —:03911 | —:07152 | +:°08744 
if 0) — ‘17827 | —:49385 | — 50388 | —°56581-| — 63299 | — °84922 -l 
Ts 0) +°23690 | +°29898 | +°29475 | +°33853 | +°33915 | +°21135 10) 
TE, 0) —°18799 | —-00734 | +°00466 | + 06947 | +°12432 | + °18307 0 
7, 0) +:°04848 | — -09506 | —°09186 | —:10916 | —:08255 | + -06601 O 
Ts 0) + 07412 | +:07799 | —:00511 | —:06648 | — -°10343 | — °05518 0) 
Ts 0) — (08325 | +:°05616 | +°05364 | + °05379 | +°01471 | — :08491 0) 
ST, +:07908 | +:29432 | +-00674 | —:10187 | — -06786 | —:14617 , — 06425 
ST, — ‘17827 | —:31558 | — 01003 | — 06193 | —-06718 | —°21622 | —-15078 
IT, +°23690 | +:06208 | —°00422 | +°04377 | +°00063 | —°12780 | —°21135 
OT. — 18799 | +°18065 | +°01200 | +-°06481 | + °05485 | +°05874 | — °18307 
ST, +°04848 | —'14354 | + 00320 | —:01731 | +:°02662 | +°14856 | — -06601 
IT; +°07412 | —:06613 | —-°01310 | —:061387 | — :03695 | +°04825 | 4+°05518 
ST; — 08325 | +°13941 | — 00253 | +°00015 | — -03907 | — 09962 | + °08491 


A 
The following table gives the corresponding quantities $\tfrac{1}{2}(1+\alpha')$'s, $k$'s, $K$'s, $\bar y$'s,
$\tau'$'s, $\vartheta\tau'$'s, $T'$'s and $\vartheta T'$'s for the $y$-variate.

TABLE II. 
| 
4(1+a’) 0 034 | 335 619 He | | 1:000 
k —o !—1°82501 | — 42615 | +°30286 | + 69349 la 08482 |+1:73920 +0 
K=r, oO |+ 07545 | +:36431 | +-38106 | +-31367 |4+ 22149 |+ -08792 0 
7 — Ms-1- As | 96 | “ORO | “ORR0R | 5 \ *Q77QC 2630 214436 | 
leeat—a,)\ | 2 21916 | 95968 | —-05896 | + °49188 | +°87789 |+1°36301 he 36 | 
To. 0) — 034 — 385 - ‘619 | —*756 — ‘861 —"959 | -1 
1 =T) O | +:07545 | +:36431 | +-38106 | +°31367 | 422149 | + 08792 0 
2, 0 ~— ‘09737 | —°10978 | +°08160°|} +°15382 | +°16990 | +°10812 0) 
T3, 0) +:°07179 | —°12172 | -—°14130 | — -06647 | +:01599 | +-07268 0) 
Ta. 0) — 00929 | +°08932 | —-06851 | — +11185 | — 08942 | +°00077 0) 
Ts, 0 —-04057 | +:06463 | +-08551 | +-00990 | — -05411 | --04815 O 
ae 0 | +°03702 | —-07647 | +-06061 | +-08449 | +-04134 | —-03475 0 
Ir) -—°034 | —'301 —'284 | —'137 —'J05 — ‘098 | - 041 
Sr +:07545 | +°28886 | +°01675 | — 06739 | —-09218 | —°13857 | — :08792 
Ire! — ‘09737 | —°01241 | +°19138 | +°07222 | +°01608 | —-06178 | — -10812 
Ir3 +:07179 | —*19351 | — 01958 | +:-07483 | +°08246 | +°05669 | — :07268 
Irq — 00929 | +:09861 | - 15783 | — 04334 | +°02243 | +°09019 | —-00077 
Irs — 04057 | ++10520 | + -02088 | — -07561 | —-06401 | + 00596 | + 04815 
rg | +-03702 | —-11349 | +-13708 | +-02388 | —-04315 | —-07609 | +°03475 
Yee 0 — ‘17170 | —°49025 | — °50360 | —°53847 | —-62073 | — °80610 —1 
T, a0) +°23105 | +°30489 | + °29416 | +°32847 | +°34094 | +°25021 | 0 
Ts, 0) — 18723 | —°O1151 | +°00432 | +°04271 | +°11544 | +° 18882 | 0) 
a 0) +°05286 | —:09892 | —-09140 | —+11081 | —-O8901 | + 03768 | | 0) 
Ts! 0) +°06989 | +°01240 | — 00474 | — 04316 | —°09869 | — -08340 | O 
Ts. O | —:08412 | +°05897 | +°05326 | +°06263 | +°02276 | —-08010 O 
Ly +:°07545 | +°28886 | + °01675 | —:06739 | — 09218 | — °13358 | — :08792 
ST —-17170 | —-31855 | — -01334 | —-03488 | — -08225 | —-18537 | —-19391 
ST, +:23105 | +:07334 | —-01023 | +°03431 | +-01247 | — "09072 | — ‘25021 
ST — 18723 | +°17572 | +°01583 | + °03839 | +°07273 | +°07338 | —*18882 
oT + 05286 | —°15178 | +°00752 | —-01941 | +°02179 | +°12670 | —°03768 
ITs! +:06989 | — :05749 | —°01714 | —-03842 | — 05553 | +°10529 | + °08340 
IT; | — ‘08412 | +°14309 | — 00571 | + °00937 | — ‘03988 | — 10286 | +°08010 
| i 


From Tables I and II we can find from (x) the value of $\bar n_{ss'}/N$ for any given
value of $r$, and by equating $\bar n_{ss'}/N$ to $n_{ss'}/N$ we should have an equation to determine
the correlation $r$ from that cell alone. The weighted mean of these 49 $r$'s would
be Ritchie-Scott's polychoric correlation coefficient. But the labour would be
immense*.


We are now in a position to give the product $\vartheta_s\tau_p\,\vartheta_{s'}\tau'_p$: see Table III, p. 138.
There are certain checks on the accuracy of this table, namely

$$S_{ss'}\left(\vartheta_s\tau_p\,\vartheta_{s'}\tau'_p\right) = 0 \text{ except for } p = 0, \text{ when it } = 1.$$


* We are not underrating the large amount of arithmetic of the present process. It is not likely to 
be often repeated, and the sole purpose of publishing all these tables for an individual case is to impress 
the reader with that fact; while at the same time illustrating the actual numerical processes. The 
amount of arithmetic, great as it is, is relatively small compared with that of solving and weighting the 
resulting $r$'s in the case of a 49-cell table.
TABLE III.
Values of $\vartheta_s\tau_p\,\vartheta_{s'}\tau'_p$.
s’ | p s=1 f= s=3 s=4 s=5 s=6 s=7 Pp 8 
0 |+-001,224 | +-°010,948 + -008,976 | + -006,120 |-+ 002,346 | + -003,434 | + 000,952 | 0 
1 |+-005,967 | + 022,206 | + 000,509 | - 007,686 | - 005,119 | — 011,029 | — 004,848 | 1 
2 |+-009,795 | — 000,442 | — 017,487 | -- -008,128 | — 000,126 | + 007,934 |4+-008,454| 2 
L | 3 |+-005,184 | — 014,679 — 000,571 + -007,786 + °004,004 |+-003,270 |—-004,994] 3 | 1 
4 | + 000,064 | — 000,803 | + °001,389 + -000,371 | — 000,244 | — 000,928 }+ 000,152) 4 
5 |+7001,741 | — 004,816 | — 000,346 | + 004,249 + 001,541 | — 001,042 | — -001,327 | 5 
| 6 | +-001,353 | — 003,919 + -004,855 | + -000,421 |— 001,448 | — 002,648 | + “001,386 | 6 
= eB speed ale ee abn ae 
0 |+:010,836 | + 096,922 | + ‘079,464 + 054,180 | +-020,769 |+-030,401 | + °008,428) 0 
1 |+°022,843 | + -085,017 | + -001,947 | — -029,426 | - 019,599 | — 042,223 |— -018,559| 1 
2 |+ 001,248 | — 000,056 | — 002,229 | — 001,036 | — -000,016 }+ -001,011 |+-001,077| 2 
2 | 3 | - 013,973 | + 039,567 | + -001,538 | — -020,986 |— -010,792 | — 008,814 |+°013,461} 3 | 2 
4 | --000,678 | + 008,520 | — 014,745 | — 003,934 |+ 002,594 |+-009,854 |— 001,611] 4 
5 |-:004,514 | + -012,487 | + 000,897 |— -011,018 | — ‘003,995 | + °002,703 |+ 003,440 | 5 
6 |—-004,147 | + -012,015 | — -014,884 | — -001,290 |-+ 004,439 | + 008,117 | — -004,249| 6 
O |+:010,224 |+ -091,448 | + 074,976 |+ 051,120 |-+ 019,596 | + 028,684 |+ "007,952 | 0 
1 |+-001,325 | +-004,930 | + ‘000,113 | — -001,706 | — 001,136 | — ‘002,448 |- -001,076 | 1 
2 |—-019,253 | + 000,869 |-+ 034,370 | + -015,976 | + ‘000,247 | — 015,594 |— 016,616 | 2 
3 | 3 |--001,414 |-+ -004,004 | + 000,156 | — 002,123 | -- 001,092 | — :000,892 |+ 001,362] 3 | 3 
4 |+:001,086 | — -013,637 | + -023,600 | + 006,296 | - 004,153 | — 015,772 |+ ‘002,579 | 4 
5 |— 000,896 | + -002,478 | + -000,178 | — 002,187 | — 000,793 |-+ -000,536 |+ °000,683 | 5 
6 | + 005,009 | — -014,513 |+ 017,978 | +-001,559 | — 005,361 | — “009,804 |+-005,132| 6 
0 |+ 004,932 |+-044,114 |+ 036,168 | + -024,660 |+ 009,453 | + 013,837 |+ ‘003,836 | 0 
1 | — 005,329 | — -019,834 |— -000,454 |-+-006,865 |+ °004,572 | + ‘009,850 |+ 004,330 | 1 
2 |—-007,265 | + 000,328 | + -012,970 | + 006,029 | + 000,093 | — 005,884 | — 006,270 | 2 
4 | 3 |+°005,403 | — -015,300 | — -000,595 | + -008,115 |+ -004,173 | + -003,409 | — 005,205] 3 | 4 
4 |+ 000,298 | — -003,745 |+ 006,481 |+ 001,729 |— 001,140 | — 004,331 }+-000,708| 4 
5 |+ 008,244 | — -008,975 | — 000,645 | -+ 007,919 | + 002,872 | — 001,942 |— ‘002,472 | 5 
6 |+°000,873 | — -002,528 | + 003,132 |+ 000,272 | — 000,934 | — 001,708 }|+ 000,894 | 6 
0 |+ 003,780 | +. 033,810 |+ 027,720 |+ :018,900 |+ 007,245 | + 010,605 | + °002,940) 0 
1 | —-007,290 | — 027,130 | — 000,621 | + -009,390 |+ 006,254 |+ °013,474 |+ 005,923] 1 
2 |—-001,678 | + 000,073 | + 002,888 |+ 001,342 |4+-000,021 | — 001,310 |— 001,396 | 2 
5 | 3 |+:005,954 | — 016,861 | — -000,656 | + 008,943 |+ -004,599 |+ °003,756 |— ‘005,736 | 3 | 5 
4 |—-000,154 | +-001,938 | — -003,354 | — ‘000,895 |+ 000,590 |+ 002,241 |— 000,367 | 4 
5 |+-002,747 | — -007,598 | — -000,546 | + -006,704 |+ 002,431 | — 001,644 |— 002,093 | 5 
6 |—-001,577 |+ 004,568 | — -005,659 | — -000,491 |+ 001,688 |+ 003,086 | - ‘001,616 | 6 
O |+°003,528 |-+ °031,556 |+ -025,872 | + 017,640 |+ 006,762 |+ 009,898 |+ 002,744 | 0 
1 |—-010,563 | — 039,312 | — 000,900 |+ 013,607 |-+ 009,063 |+ °019,524 |+ ‘008,582 | 1 
2 |+ 006,215 |—-000,280 | — -011,095 |— -005,157 |— 000,080 |-+ 005,034 |+ -005,364 | 2 
6 | 3 |+-004,094 |—-011,591 | — 000,451 |+ 006,148 |+ 003,162 |+ °002,582 |— ‘003,943 3 | 6 
| 4 |—-000,621 | + -007,792 |— -013,486 | — -003,598 | + 002,373 |+ °009,013 |— 001,474 | 4 
| 5 |—-000,256 |-+-007,107 |+ 000,051 | — -000,624 | — -000,226 |+ -000,153 |+ ‘000,195 | 5 
6 |—-002,780 | + -008,056 | — -009,979 |— -000,865 |+ -002,976 |-+ "005,442 | — 002,849 6 
O |- -001,476 |-+ 013,202 |+ -010,824 |+ -007,380 |+ 002,829 |+ 004,141 |+ °001,148 | 0 
1 |= 006,953 | — -025,877 | — 000,593 |-+ 008,956 | + -005,965 |+ ‘012,851 |4 ‘005,649 } 1 
2 |+:010,877 | — 000,491 |—-019,417 | — 009,026 | — -000,139 |+ 008,810 | + °009,387 | 2 | _ 
7 | 3 |= -005,248 |+-014,861 |+-000,578 | — -007,882 | — -004,053 | — 003,311 }+°005,056| 3 | 7 
4 |+-000,005 | — 000,067 | + -000,115 | + -000,031 | — 000,020 | — 000,077 |4+ 000,013 | 4 
5 |—-002,066 | + -005,715 | + -000,411 | — -005,043 | — 001,829 | 4+ 001,237 |+°001,575 | 5 
6 |+-001,270 | — -003,679 | + -004,557 |+ 000,395 | — 001,359 | — 002,485 }-+ 001,301} 6 
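Applying these tests we find:

$$S_{ss'}(\vartheta_s\tau_0\,\vartheta_{s'}\tau'_0) = 1.000000, \qquad S_{ss'}(\vartheta_s\tau_1\,\vartheta_{s'}\tau'_1) = +0.000001,$$
$$S_{ss'}(\vartheta_s\tau_2\,\vartheta_{s'}\tau'_2) = +0.000001, \qquad S_{ss'}(\vartheta_s\tau_3\,\vartheta_{s'}\tau'_3) = +0.000008,$$
$$S_{ss'}(\vartheta_s\tau_4\,\vartheta_{s'}\tau'_4) = -0.000002, \qquad S_{ss'}(\vartheta_s\tau_5\,\vartheta_{s'}\tau'_5) = +0.000001,$$
$$\text{and} \quad S_{ss'}(\vartheta_s\tau_6\,\vartheta_{s'}\tau'_6) = +0.000002,$$

results as close as we should expect, when we take into account the fact that our
$\vartheta\tau$'s were only to five-figure accuracy, and our products to six.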
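
The check for $p = 0$ can be illustrated with a minimal sketch. It assumes that each $p$-block of Table III is simply the outer product of the difference rows of Tables I and II, so that the double sum factorises into the product of the two row sums. The seven differences of $\tau_0$ below are read from those two tables; the function and variable names are illustrative only and are not part of the original computation.

```python
# First differences of tau_0 for the seven categories, read from the
# difference rows of Tables I and II (x-variate and y-variate).
d_tau0_x = [-0.036, -0.322, -0.264, -0.180, -0.069, -0.101, -0.028]
d_tau0_y = [-0.034, -0.301, -0.284, -0.137, -0.105, -0.098, -0.041]


def product_block(dx, dy):
    """Return block[s' - 1][s - 1] = dy[s' - 1] * dx[s - 1], one p-block."""
    return [[b * a for a in dx] for b in dy]


def block_total(block):
    """Sum a block over both s and s'."""
    return sum(sum(row) for row in block)


block_p0 = product_block(d_tau0_x, d_tau0_y)
total_p0 = block_total(block_p0)

print(round(block_p0[0][0], 6))  # cell s = 1, s' = 1
print(round(block_p0[1][2], 6))  # cell s = 3, s' = 2
print(round(total_p0, 6))        # sum over all 49 cells
```

Because each row of differences telescopes, its sum depends only on the end values of the function differenced. For $\tau_0$ each row sums to $-1$, giving a total of unity; for the higher functions the tabled end values are zero, which is why the remaining totals should vanish apart from rounding.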


The meaning of Table III should be quite intelligible; namely, for example:

$$0.084 = \frac{n_{32}}{N} = 0.079464 + 0.001947\,r - 0.002229\,r^2 + 0.001538\,r^3 - 0.014745\,r^4 + 0.000897\,r^5 - 0.014884\,r^6 + \dots \qquad \text{(xxi)}$$

is the equation which will give the correlation coefficient $r$ as deduced from the
(3, 2) cell. If $r$ be given any other value the right hand of the above expression
is equal to the contents of the (3, 2) cell for a normal correlation surface of
correlation coefficient $r$ having the observed marginal totals.
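
As a hedged illustration of how the right hand of (xxi) is used, the sketch below evaluates the truncated series for the (3, 2) cell at three trial values of the correlation and divides by the observed proportion 0.084. The coefficients are those printed in (xxi); the helper name `cell_proportion` is an assumption introduced for this example only.

```python
# Coefficients of r^0 ... r^6 for the (3, 2) cell, as printed in (xxi).
COEFFS_32 = [0.079464, 0.001947, -0.002229, 0.001538,
             -0.014745, 0.000897, -0.014884]
OBSERVED_32 = 0.084  # observed n_32 / N, the left-hand side of (xxi)


def cell_proportion(coeffs, r):
    """Evaluate sum_p coeffs[p] * r**p by Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * r + c
    return total


# Ratio of theoretical to observed cell content for three trial values of r.
ratios = {r: cell_proportion(COEFFS_32, r) / OBSERVED_32
          for r in (0.45, 0.50, 0.55)}

for r, ratio in ratios.items():
    print(f"r = {r:.2f}: theoretical / observed = {ratio:.6f}")
```

These ratios can be compared with the entries 0.944250, 0.939833 and 0.933333 given for this cell in Table IV.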


Thus far the arithmetic is absolutely comparable with that needed for Ritchie-
Scott's "polychoric $r$." We should have to solve the 49 equations, and then
calculate (the stiffest part of the work) the probable errors of the 49 correlation
coefficients which are the roots of these equations. Using these probable errors as
our weighting data, we should find a mean coefficient. Our purpose is to replace
the weighting and the solution of the 49 equations by the solution of a single
equation. It will be noticed that both Ritchie-Scott's and our methods have an
undesirable limitation, for we both assume the marginal totals to be those of the
normal correlation surface. Actually in our case we ought to treat the marginal
totals as unknown, or select $h_1, h_2, h_3, \dots h_6$, $k_1, k_2, k_3, \dots k_6$ as well as $r$ to give as
closely as possible the observed frequencies. Now the $\tau$'s and consequently the
$T$'s and $\vartheta\tau$'s and $\vartheta T$'s all depend upon the $h$'s and $k$'s, and the equations obtained
by making

$$S_{ss'}\left\{\frac{(n_{ss'} - \bar n_{ss'})^2}{\bar n_{ss'}}\right\} = \text{minimum}$$

do not appear to lend themselves to any reasonably brief system of solutions. We
were compelled therefore to introduce the admittedly limited form of solution, i.e. the
determination of the best normal correlation surface subject to the restriction of
its having the same marginal totals as the observed frequency surface. We consider
this a practically necessary but none the less grave restriction.

We next proceeded to determine the value of $\bar n_{ss'}/n_{ss'}$ and $(\bar n_{ss'}/n_{ss'})^2$ for certain
selected values of $r$ in order to build up equation (xvii) and solve it by
interpolation. The values chosen were: 0.45, 0.50 and 0.55. These cover the range
within which we anticipate the solution of (xvii) for $r$ will lie. We need also the
value of the numerator in (xvii), i.e.

$$\nu_{ss'} = \vartheta_s\tau_1\,\vartheta_{s'}\tau'_1 + 2r\,\vartheta_s\tau_2\,\vartheta_{s'}\tau'_2 + 3r^2\,\vartheta_s\tau_3\,\vartheta_{s'}\tau'_3 + \dots,$$

for the same three values of $r$. These results are given in Table IV.
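
The numerator of (xvii) can be read as the term-by-term derivative, with respect to the correlation, of the cell series whose coefficients are listed in Table III. A minimal sketch for the cell $s = 1$, $s' = 1$ follows; the seven coefficients are copied from the first column of Table III, while the function name `nu` and the dictionary layout are assumptions made for illustration.

```python
# Table III entries for the cell s = 1, s' = 1, indexed by p = 0 ... 6.
TABLE_III_11 = [0.001224, 0.005967, 0.009795, 0.005184,
                0.000064, 0.001741, 0.001353]


def nu(coeffs, r):
    """Return sum over p >= 1 of p * coeffs[p] * r**(p - 1)."""
    return sum(p * c * r ** (p - 1) for p, c in enumerate(coeffs) if p >= 1)


# Numerator of (xvii) for this cell at the three trial values of r.
nu_values = {r: nu(TABLE_III_11, r) for r in (0.45, 0.50, 0.55)}

for r, value in nu_values.items():
    print(f"r = {r:.2f}: nu_11 = {value:+.6f}")
```

The three results may be set against the entries +0.018462, +0.020480 and +0.022694 recorded for this cell in Table IV.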
TABLE IV. Values of $\bar n_{ss'}/n_{ss'}$, $(\bar n_{ss'}/n_{ss'})^2$ and $\nu_{ss'}$.
St Function sil s=2 9=8} s=4 s=5 s=6 G=7/ 
Tiss'|Mes’ (2) | 1°602,750 | -879,954 | °814,714 wo "388,000 co co 
fe) | 2119000, -oie'da| verresr| so | aeeoo0| é 
= , 6e) | 2°115,5¢ "916,405 587,85 ao “174,000 eo) <9) 
(Tisg:/g9') (@) | 2°568,806 | °774,321 663,759 ea "150,544 ry oo 
1 (b) | 3:°407,716 | +813,604| -497,830 2 070,756 00 2 
(c) | 4:475,340| 839,805 | 345,576 oo ‘030,276 ca <0 
vs (a) | + 018,462 | + 011,177 | — °014,603 | — 009,218 | — 002,733 | — ‘002,747 | — 000,336 
b) |+ 020,480 |+ 008,113 | — -015,910 | — 008,382 | — -002,154 | — -001,929 | — 000,218 
(c) |+°022,694 |+ -004,477 | — 017,013 | — 007,243 |— -001,519 | — 001,228 | — -000,168 
isy'Msy (2) | *867,348] 905,539 | +944,250| 1°478,500 | 1:379,000 | 1-887,333 oo 
b)| 894,565) -944,630| 939,833 | 1:383,615 | 1-215,375 | 1:544,667 wo 
_ _ (¢) | 915,130) 986,935) -933,333 | 1-278,500 | 1-043,500 | 1:213,333 Ee 
(ise’/Nas’)” (@) | °752,293 | +820,001 | °891,608 | 2-185,962 | 1-901,641 | 3°562,026 oo 
2 (b)| +800,247| -892,326| 883,286 | 1:914,390 | 1:477,136 | 2°385,996 ea 
(c) | *837,463| 974,041) 871,110) 1-634,562 | 1-088,892 | 1-472,177 wo 
vey (@) | + 013,846 | + °116,000 | — -005,963 | — 046,943 | — -025,552 | — 041,623 | — 009,764 
(b) | + °011,084 |4++125,051 | — 009,011 | — 051,854 | — 026,828 | — 040,529 | — 007,913 
(¢) | + 007,767 | +°135,874 | — 013,006 | — 057,659 | — 028,171 | — ‘038,864 | — -005,940 
Tisg[Nse (4) | *857,750 | 1:075,552 | 1°108,280] -812,500]| °854,818| -984,375 | 2°194,000 
(0) | °751,875 | 1-076,195 | 1°138,747 | -823,410] 844,773 | 930,333 | 1°846,500 
_ (e)| °635,750 | 1:075,448 | 1°175,027 | -835,909 | -831,636 | 866,000 | 1°486,500 
(Mgs-/Msq)? (@) | *735,735 | 1°156,812 | 1°228,285 | -660,156 |) °730,714| *968,994| 4:813,636 
3 (6) | *565,316 | 1:158,196 | 1:296,745 | -678,004| -713,641| 865,519 | 3:409,562 
(c)| 404,178 | 1-156,588 | 1°380,688 | -698,744| -691,618| -°749,956 | 2:209,682 
vey (a) | — 016,095 | + 002,075 |+ 041,770 | + 013,402 | — -003,847 | — 023,749 | — 013,555 
(6) |— °017,786 | + 000,037 | + °049,827 | + 015,435 | — -005,038 | — 028,268 | — ‘014,205 
7 __(¢) |= 019,311 | — -002,805 | + ‘059,278 | + 017,601 | — -006,601 | — 033,522 |- ‘014,539 
Tiss'[Msy (@) | 1°634,000 | 1°155,897 | 1:078,222| -808,892 | 850,571 | 1°225,786| -671,833 
(6) | 1:260,000 | 1-096,966 | 1°098,417 | -837,135 | 877,714 | 1:239,929| -627,333 
- (c) | *917,000 | 1-030,862 | 1°121,944| -869,568 | -907,429 | 1-250,000| -570,000 
(7iss’/sg')? (@) | 2°669,956 | 1°336,098 | 1°162,563 | -654,306| -723,471 | 1°502,551 | -451,360 
4 (6) | 1°587,600 | 1-203,334 | 1:206,520 | -700,795| °770,382 | 1°537,424| 393,547 
(c)| 840,889 | 1-062,676 | 1°258,758 | -756,149 | °823,427 | 1°562,500| 324,900 
Ye (@) | — 007,715 | - -032,319 | + 013,434 |+ -019,505 | +-007,261 |-+°004,459 | — 004,625 
(0) | — 007,215 | — -036,182 | + °015,696 | + -022,370 | 4 007,947 | + 003,430 | — 006,095 
(c) | — 006,471 | — -040,720 |+ 018,237 |+ 025,717 |+ 008,735 |+ ‘002,185 | — 007,680 
Tiss'[Nsy () oo 1:114,278 | 1°028,556 | 934,423} -960,545| 935,111] 946,600 
(0) co 1:006,167 | 1°027,185 | +969,000 | 1:008,273 | 978,944 | -944,400 
: a) ca -890,389 | 1°024,148 | 1:007,692 | 1-061,727 | 1°025,111 |] -927,400 
o [Orr] 2 | Louw] 8600 | Sasa | athe | ‘ace | “Setar 
oe) 012,372 055,10! Boe ) : - 958 ,3% 4 
(c)| ow 792,793 | 1:048,879 | 1-015,443 | 1+127,264 | 1-050,853 | -860,071 
vey (&) | — 004,797 | — -037,653 | — 000,381 | + 017,025 | + 009,967 | + 015,398 | + -000,440 
(6) | — 003,957 | — 040,252 | — 001,134 | + -018,995 |-+-011,095 | + 016,166 | — -000,916 
(c) | — 002,988 | — 043,158 | — 002,230 | + 021,305 |+ 012,465 |+ 017,118 | — 002,508 
iss/Nsg_ (CL) ee) 1:461,333 | °867,077 | 1:216,474 | 1°604,286 | °701,931 | -906,500 
(db) 20 1°224,000 | °830,576 | 1-245,526 | 1°693,714] °754,966| -969,125 
7 , (2 co ‘988,111 | °786,077 | 1-273,789 | 1°791,000| 812,828 | 1:028,375 
g [ore | |} Tawive| -esersee | rseirsae | a-cnerenel) seu maa Maa iene 
“498,17 : 5 "551,385 |, 278 367 | 75 7 ; 
9 2 ‘976,363 | °617,917 | 1°622,538 | 3-207,681 | °660,689 | 1-057,555 
vey (4) |= "008,069 | — 042,728 | — 017,170 | + 011,165 | + 012,060 | + °029,542 | +4 010,202 
(b) | — 002,189 |—-042,658 | — 020,931 | + -010,905 | + -013,028 | + -032,069 | + -009,778 
(ce) | — ‘001,381 | — -042,197 | — 025,479 | + -010,572 |-+ 014,219 |+ 035,116 | + 009,152 
Mggi[Nsg (4) ce) ‘961,333 | °747,556 | 1-462,667 | -845,000 | 1:140,500] -870,286 
(b ea ‘705,000 | °648,556 | 1°411,167 | 865,000 | 1-235,000 | 1-003,143 
- ., (2) a0 "491,000 | 542,000 | 1:337,333 | 877,000 | 1°331,000 | 1°150,286 
7 [Owlw |. | aorcoss | asotess | 1-aeicsen | -fasiose | Treenteep | aeaneae 
oo *497,025 “Ag 2 = 39% *748,225 "525,225 ‘006 
{6 ie 241,081 | -293,764 | 1°788,460 | -769,129 | 1-771,561 | 1-323,158 
Ye (@) | — 000,633 | — 016,551 | — 017,086 | — 004,935 | + 002,845 | 4+ °018,719 | +°017,641 
(b) | — 000,417 | — 014,160 | — -0182536 | — -007,468 | +-001,950 | + -019,060 |+-019,571 
(¢) | — 000,309 |— 011,471 | - -019,787 |— 010,293 | + :000,873 | + °019,302 | + °021,685 


The values (a), (b), (c) refer respectively to $r$ = 0.45, 0.50, 0.55.
Having obtained $(\bar n_{ss'}/n_{ss'})^2$ and $\nu_{ss'}$ for the trial values of $r$, it is only a matter
of adding $\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2$ for all values of $s$ and $s'$ on the machine in order to obtain:

$$u = S_{ss'}\{\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2\}.$$

The values obtained were:


r =    0.45         0.50         0.55
u =   +0.157074    +0.012276    -0.209976


Whence by inverse interpolation* we find:
$u = 0$ for $r$ = 0.5034,
which is "polychoric $r$" as based upon Equation (xvii). We shall compare later
the value for $r$ as found by other processes. But the above value is clearly well
in accord with the usual result for paternal correlation in man.
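
The inverse interpolation can be sketched as follows, assuming the three-point central-difference quadratic quoted in the footnote, with $\theta$ measured in units of the tabular interval 0.05 from the middle value $r = 0.50$. The function name and argument order are hypothetical; the three values of $u$ are those tabulated above.

```python
import math


def inverse_central_interpolation(r0, step, z_minus, z_zero, z_plus):
    """Solve z(theta) = 0 for the quadratic

        z(theta) = z0 + theta * (dz_minus + dz_zero) / 2 + theta**2 * d2z / 2

    and return r0 + theta * step, taking the root nearer theta = 0.
    """
    dz_minus = z_zero - z_minus   # first difference below the centre
    dz_zero = z_plus - z_zero     # first difference above the centre
    d2z = dz_zero - dz_minus      # second central difference
    a = 0.5 * d2z
    b = 0.5 * (dz_minus + dz_zero)
    c = z_zero
    if a == 0.0:
        theta = -c / b            # degenerate case: linear interpolation
    else:
        disc = math.sqrt(b * b - 4.0 * a * c)
        roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
        theta = min(roots, key=abs)
    return r0 + theta * step


r_u = inverse_central_interpolation(0.50, 0.05, 0.157074, 0.012276, -0.209976)
print(round(r_u, 4))
```

Under these assumptions the root falls very close to the value 0.5034 stated in the text; a difference of about a unit in the fourth decimal place would be within the rounding of the tabulated sums.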
Table V gives the working values of $\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2$.
TABLE V. Values of $\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2$†.


s r $= | s=—2 so sa=:4 s=5) S=—0) s=7 


(a) |+:007,186 |+4-014,435 | — 022,001 


0 | -018,159 0 0 

1 | (b) |+ 006,010 | + -009,972 | — 031,958 0 | —*030,424 0 0 
(ec) |-+-005,071 |+-005,331 | — 049,232 0 — 050,132 0 0 
(a) |+-018,405 |+-141,463 | - -006,688 |— -021,475 | — 013,436 | — -011,685 0 

2 | (b) |+-013,851 |4+°140,141 |— 010,202 | — 027,086 | — 018,162 |— -016,986 0 
(c) |+:009,274 |-+ -139,495 | — 014,930 | — 035,275 |— ‘025,871 |— -026,399 0 
(a) |—-021,876 |+-001,794 | +-034,007 | + 020,301 | — 005,265 | — 024,509 | — 002,816 

3 | (b) |—-031,462 |+ -000,032 | + -038,425 |+ -022,755 | — 007,060 | — 032,660 | — 004,166 
(ce) |—-047,779 |—-002,425 |+ 


| 


002,890 | — °024,189 |+ ‘011,556 
4 (6) |--004,543 |— -030,027 | + °013,009 
(ec) |—+007,695 | —-038,318 | + -014,488 


— 
8 
Se 
| 


029,810 | + -010,036 | + 002,968 | — -010,247 
031,921 | + 010,316 | + -002,231 |~ -015,487 


at 
ee 
042,934 | 4+ 025,189 | — 009,544 | — -044,832 | — :006,579 
+ 
+ 
+ 034,010 | + °010,608 | + 001,398 | — 023,039 


(a) 0 — 030,326 | — 000,360 | + -019,498 | — 010,803 | + -017,610 | + 000,491 
5 | (b) 0 — 039,760 | — 001,075 | + 020,230 | + 010,914 | + -016,869 | — 001,027 
(c) 0 — 054,439 | — -002,126 | + -020,981 |+-011,058 | + -016,285 | — -002,916 
(a) 0 — 020,008 | — -022,838 | + -007,545 | + -004,686 | + -059,958 | + -012,415 
6 | (b) 0 — 028,474 |— -030,341 | + -007,029 | + 004,542 | + 056,264 |+-010,411 
(c) 0 ~ 043,219 |— -041,234 | + -006,516 | + 004,433 | + -053,151 | + -008,654 
(a) 0 — 017,910 | ~ 030,574 | — 002,307 | + “003,984 | + -014,391 | + -023,291 
7 | (b) 0 ~ 028,491 |— -044,067 | — -003,750 | +-002,606 | + -012,497 | +-019,449 
(e) 0 ~-047,576 | — -067,356 | — -005,755 | + 001,135 | + -010,895 |+ 016,389 


S(a) = +0.157074, S(b) = +0.012276, S(c) = -0.209976.

* The formula used was Casus I or $z_\theta = z_0 + \tfrac{1}{2}\theta(\Delta z_{-1} + \Delta z_0) + \tfrac{1}{2}\theta^2\delta^2 z_0$, the solution of the quadratic
giving $\theta$.

† The table suggests, a posteriori, that we should have got quite reasonable results from linear
interpolation; we have: from (a) and (b) $r$ = 0.5042; from (a) and (c) $r$ = 0.4928, and from (b) and (c) $r$ = 0.5025,
as against our 0.5034. It should be noticed that the values in Table V are not always in agreement in
the last figure with those obtained by dividing $\nu_{ss'}$ in Table IV by the $(\bar n_{ss'}/n_{ss'})^2$ of that table, because the
somewhat more accurate process was adopted of multiplying $\nu_{ss'}$ by $n_{ss'}^2$ and then dividing by $\bar n_{ss'}^2$. Still
the physical meanings of $\bar n_{ss'}/n_{ss'}$ and $(\bar n_{ss'}/n_{ss'})^2$ are so prominent in the work that it seemed desirable
to register their values.
Before we consider the graph due to this solution, let us investigate the value
of $r$ to be found from (xvi). The values of $\bar n_{ss'}/n_{ss'}$ are already provided in
Table IV, but we need a table corresponding to Table III giving the product
$\vartheta_s T_p\,\vartheta_{s'}T'_p$ instead of the product $\vartheta_s\tau_p\,\vartheta_{s'}\tau'_p$. This is provided in Table VI.
Further if

$$\kappa_{ss'} = \vartheta_s T_0\,\vartheta_{s'}T'_0 + r\,\vartheta_s T_1\,\vartheta_{s'}T'_1 + r^2\,\vartheta_s T_2\,\vartheta_{s'}T'_2 + \dots,$$

Table VII (p. 143) provides $\kappa_{ss'}$ for the same three values of $r$, i.e. 0.45, 0.50 and
0.55. Finally Table VIII (p. 143) gives $\kappa_{ss'}/(\bar n_{ss'}/n_{ss'})$, whence by summing we obtain

$$v = r - S_{ss'}\{\kappa_{ss'}/(\bar n_{ss'}/n_{ss'})\}$$

for the three cases.

Using the same interpolation formula as before in order to discover the value
of $r$ for which $v = 0$ we find:

$r$ = 0.5204.

There is thus a difference of 0.0170 between the two methods. The probable
error found for the product-moment $r$ is 0.0160 and the result by the usual product-
moment process may be given:

$r$ = 0.5189 ± 0.0160.

Thus either of the values reached by the methods of this paper differs by less
than the probable error from the true product-moment value.


(4) If we work out the results by mean square contingency we find:

$$C_1 = 0.480690,$$

and the class index correlations are*:

For fathers: $r_{xC_x} = 0.962329$.
For sons: $r_{yC_y} = 0.964523$.

Hence correlation from mean square contingency

$$r = C_1/(r_{xC_x}\,r_{yC_y}) = 0.5179,$$

which is in excellent agreement with the product-moment value.
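
The class index correction amounts to a single division, sketched below with the three figures quoted above. The function name and parameter names are illustrative assumptions.

```python
def corrected_contingency_r(contingency, class_index_x, class_index_y):
    """Divide the contingency coefficient by both class index correlations."""
    return contingency / (class_index_x * class_index_y)


# Contingency coefficient, then the class index correlations for
# fathers and for sons, as quoted in the text.
r_contingency = corrected_contingency_r(0.480690, 0.962329, 0.964523)
print(round(r_contingency, 4))
```

The result can be compared directly with the product-moment value 0.5189 and its probable error 0.0160.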

It would therefore be quite reasonable for such a table as the present to use 
mean square contingency and class index corrections, and save the heavy labour 
of Equation (xvi bis) or (xvii). At the same time we cannot assert that this 
process would always be equally satisfactory for tables with but few broad 
categories and with much higher correlation. 

Our two processes seem to give values slightly in defect and in excess of the
true value of $r$, and we might use their mean, i.e. 0.5118, to obtain our graph. We
shall, however, first proceed to compare the actual results of solving (xiv) and 
substituting in (xv) with the result of such approximative processes. 

Table IX (p. 145) gives the products of 8, 7, Sy Ty’ and will therefore enable us 
by aid of Table IV (p. 140) which gives the values of fisy/nsy to obtain hs. for any 
value of 7. Let 

he =, 7 Se He EPS Sh Fe PSS ta eee (xxil). 


* Using the values of $\bar x_s$ and $\bar y_{s'}$ in Tables I and II respectively.
TABLE VI.
Values of $\vartheta_s T_p\,\vartheta_{s'}T'_p$.


p geil s=2 3=38 oa ce s=6 si p 
O |+:005,966 |-+ °022,207 |+ 000,509 | — 007,686 | — -005,120 | — 011,028 |— 004,848 | 0 
1 }+:030,608 |+ 054,185 |+ -001,722 | + 010,633 | + -011,536 | + 037,126 |+ 025,890 | 1 
2 |+:054,736 |+ -014,342 | — 000,976 |+ -010,114 | + 000,145 | — 029,528 |— 048,833 | 2 
3 |+-035,199 | — -033,825 | — 002,246 | — -012,136 | — -010,270 | — 010,999 |+ 034,276 | 3 
4 |+-°002,562 |—-007,587 | + 000,169 | — -000,915 | + 001,407 | + 007,853 |— °003,489 | 4 
5 |+:005,180 | — -004,622 | — 000,915 | — -004,289 | — :002,583 | + 003,372 |+°003,856 | 5 
6 |+ 007,003 | — 011,727 |+:000,213 | — -000,013 | + 003,287 | + °008,380 |— 007,142 | 6 
O |+°022,842 |+ -085,018 |-+ 001,948 | — 029,426 | — 019,601 | — 042,222 |— "018,559 | 0 
1 |+:°056,787 | + °100,528 |+ 003,195 |+ -019,728 | + °021,402 | + 068,878 |+ °048,033 | 1 
2 |+-:017,375 |+ 004,553 | — 000,310 | + -003,210 | + ‘000,046 | — -009,373 |— 015,501 | 2 
3 |—'033,035 |+ -031,745 | + 002,108 | + -011,389 | + 009,639 | + 010,323 |— 032,169} 3 | 2 
4 | -—-007,358 |+ ‘021,786 | — ‘000,486 | + -002,627 | — 004,040 | — 022,548 |+ "010,019 | 4 
5 |—:004,261 |+ -003,802 | + :000,753 | + -003,528 | + ‘002,124 | — 002,774 |— 003,172] 5 
6 |—-011,913 |+-019,949 | — 000,362 | + -000,022 | — 005,591 | — 014,255 |+ 012,150 | 6 
O |+:001,324 |+-004,929 | + -000,113 |— -001,706 | — 001,136 | — 002,448 |— 001,076 | 0 
1 |+ 002,378 |+-004,211 | + 000,134 | + -000,826 | + -000,896 | + ‘002,885 |+ °002,012] 1 
2 |— 002,423 | — -000,635 | + 000,043 | — -000,448 | — 000,006 | + 001,307 |+ °002,162] 2 
3 | — 002,976 | + 002,860 | + 000,190 | + -001,026 | + 000,868 | + °000,930 |— °002,898 | 3 
4 |+ 000,365 | — -001,080 | + ‘000,024 | — 000,130 | + -000,200 | + °001,118 |— 000,497 | 4 
5 |—-001,271 |+-001,134 |+ 090,225 |+ -001,052 | + 000,634 | — -000,827 | - 000,946 | 5 
6 |+:000,475 | — 000,796 | + 000,014 | — -000,001 | + -000,223 | + 000,569 |— 000,485) 6 
0 |—-005,329 | — 019,833 |— -000,455 |+ 006,865 | + 004,573 | +-009,850 |+ 004,330] 0 
1 |+:006,217 |+ -011,006 | + 000,350 | + -002,160 | + 002,343 |+°007,541 |+ 005,259 | 1 
2 |+:008,127 |+ 002,130 | ~ -000,145 | + -001,502 | + 000,022 | — 004,385 |— 007,251] 2 
3 |— 007,217 |+ 006,935 |-+ 000,461 |+ 002,488 |+ °002,106 | + °002,255 | — "007,028 | 3 
4 |—-000,941 |-+ -002,786 | — 000,062 |-+ -000,336 | — 000,517 | — 002,883 |+°001,281] 4 
5 |—°002,847 | + °002,540 | + 000,503 | + -002,358 | + 001,419 | — 001,854 |— 002,120] 5 
6 | - ‘000,780 | + 001,306 | — -000,024 |+ 000,001 | — 000,366 | — -000,934 |+ 000,796 | 6 
0 |—:007,289 | — -027,130 | — 000,622 |+ -009,390 | + 006,255 | + °013,473 |+ "005,923 | 0 
1 |+ 014,662 | + °025,956 | + -000,825 |+ -005,094 | + 005,526 |+ 017,784 |+ 012,402 | 1 
2 |+ :002,953 | + 000,774 | — -000,053 | + 000,546 | + -000,008 | — 001,593 |— 002,635 | 2 
5 | 3 |-:013,673 |+-013,139 | + :000,873 |+-004,714 | + -003,990 |+ 004,273 |— 013,315 | 3 
4 |+ 001,057 |— -003,128 | + -000,070 | — 000,377 | + °000,580 | + °003,238 |— 001,489 | 4 
5 |-°004,116 |+ 003,672 |+ 000,727 |+ -003,408 | + 002,052 | — 002,679 | — 003,064 | 5 
6 |+ 003,320 | — -005,559 | + 000,101 | — -000,006 + :001,558 | + -003,973 | — 003,386 | 6 
0 |—-010,563 |— -039,314 |— -000,901 | + -013,607 | + -009,064 | + 019,524 |+°008,582 | 0 
1 |+ 033,046 | + 058,500 |-+ 001,860 | + -011,480 | + 012,454 | + -040,082 |+°027,952 | 1 
-2 .|—:021,493 | — 005,632 | + :000,383 |— -003,971 | — 000,057 |+-011,595 |+°019,175 | 2 
3 |—'013,795 | + 013,256 | + 000,880 |+-004,756 + :004,025 + -004,311 |— 013,433 | 3 
4 |+:006,142 |— -018,186 |-+ 000,406 | — 002,193 | + -003,372 |-+-018,822 |— 008,364 | 4 
5 |+°001,134 |— -001,011 | — -000,200 | — 000,939 | — -000,565 | + -000,738 |+ 000,844} 5 
6 |+ 008,563 | - -014,340 | + 000,260 | — 000,016 | + -004,019 |+-010,247 | — ‘008,733 | 6 
0 |—-006,952 | — 025,876 |— :000,593 |+ 008,956 | + 005,966 | + 012,851 |+°005,649 | 0 
1 |+ 024,867 )+ 061,193 | + °001,945 |+ 012,009 | + -013,028 | -+ 041,927 |+ 029,238 | 1 
2 |—°059,276 | — 015,532 | + 001,057 | — 010,953 | — -000,157 | + 031,978 |+ °052,883 | 2 
3 |+°035,498 | — 034,111 | — 002,265 | — -012,238 | — 010,357 | — 011,092 |+ "034,567 | 3 | 7 
4 |-—-001,827 | + ‘005,409 | — -000,121 | + 000,652 | ~ 001,003 | — 005,599 |+ "002,488 | 4 
5 |+°006,181 | — -005,515 | — 001,092 | — -005,118 | — -003,082 | + -004,024 |+°004,602 | 5 
6 |— 006,668 |+ 011,167 | — 000,202 | + -000,202 | — -003,130 | — 007,980 |+ 006,801 | 6 
TABLE VII. Values of $\kappa_{ss'}$.
s' r s=1 2, s=3 s=4 s=5 s=6 s=7 
(a) |+:034,290 |+-045,918 |+000,873 | — -002,076 | — -000,798 | — 000,849 | — 000,094 
1 | (8) |+°039,786 |+ -047,855 | +-000,831 /— -001,549 | —-000,541 | — -000,496 | — 000,036 
(c) |+ 045,903 | + 049,468 | + 000,762 |— 001,097 | — 000,350 | — -000,251 | — 000,001 
(a) |+'048,425 |-+-135,200 | + :008,506 | — 018,688 | — -009,255 | — ‘013,278 | — 002,561 
2 | (6) |++050,671 | ++142,180 |+ 003,719 |— 017,061 |— ‘007,957 | — 010,554 |— -001,722 
(c) |+°052,617 |-+-149,704 | + -003,946 | — ‘015,291 |—-006,630 | — 008,054 | — -001,089 
(a) |+ 001,628 | + -006,926 | + -000,205 | — -001,317 | — 000,633 | — -000,765 | — 000,039 
3 | (b) |+:001,526 |+ 007,188 |+-000,223 | — 001,252 | — -000,545 | — 000,509 | + 000,040 
(ec) |-+-001,386 |4 007,465 | + -000,245 |~-001,175 | — 000,444 | — -000,235 | + -000,096 
(a) |—-001,641 |—+013,645 | —-000,278 |+ 008,425 |+ -005,826 |+ 012,401 |+ 004,608 
4 | (b) |-:001,250 |—-012,657 | — 000,247 |+ °008,726 |+ 006,019 |+ 012,553 |+ 004,294 
(c) |— 000,903 | — 011,563 | — -000,211 |+ 009,071 | + 006,233 | + 012,663 | + 003,892 
(a) |— 001,344 |— -014,202 |—.000,165 |+-012,270 |-+ 009,181 |+ 021,659 | + -009,613 
5 | (b) |—:000,940 | — 012,484 |— 000,085 | + 012,745 |+ 09,643 |+ 022,682 |+ -009,562 
(c) |—+000,625 | —-010,689 |+ -000,007 |+ -013,278 |+ °010,160 | + 023,755 | + -009,352 
(a) |—-000,958 | — °018,805 |+ 000,109 |+ 018,295 |+ 015,185 |+-041,172 |+ -023,419 
6 | (b) |—-000,584 | — -011,207 |+ -000,258 | + -018,782 |+ ‘016,036 | + 044,362 |+ -025,040 
(ec) |—+000,328 |—-008,749 |+ -000,419 |+-019,263 |-+ 016,957 |-+ 047,837 |+ 026,557 
(a) |—*000,047 | —:004,380 | + -000,263 |+°010,961 |+-010,729 |+-036,961 |+ :032,908 
7 (b) |-+000,076 | — 003,086 | + 000,316 |+°010,574 | + -010,938 | + -040,073 | 4+ 038,215 
(ec) |+000,159 | — 002,067 | + 000,348 |+ °010,019 | + 011,027 | + -043,208 | + 044,126 
TABLE VIII. Values of $\kappa_{ss'}/(\bar n_{ss'}/n_{ss'})$.
s’ r s=1 S12) so s=4 s=5 s=6 si 
(a) |+°021,394 |+ -052,182 |+-001,072 ) — 002,057 ) ) 
1 | (6) |-+-0213559 | +-053-054 |+-001,178 ) — ‘002,033 ) 0 
(ce) |-+°021,698 |-+-053,980 | + -001,296 ) — 022,011 ) 0 
— | - — 
(a) |+°055,831 |+°149,303 + -003,713 | — ‘012,640 |— -006,711 |— -007,035 ) 
| 2 | (6) |+°056,643 | +°150,514 |-+ 003,957 |— 012,331 | — 006,547 | — 006,833 0 
(ec) |+-057,497 |-+ 151,686 | + 004,228 | — -011,960 | — 006,354 | — -006,638 0 
(a) |+:001,898 |+ 006,439 |+ 000,185 | - ‘001,621 | — 000,741 | — ‘000,777 |—:000,018 
3 | (b) |+-002,029 | +-006,679 | + -000,196 | — -001,521 | — 000,645 | — 000,547 | + -000,022 
(ec) |-+-002,180 |-+-006,941 | +-000,209 | — -001,406 | — 000,534 | — -000,271 |-+ 000,065 
(a) ~ -001,004 — °011,805 |— :000,258 |+-010,415 | + 006,850 |+ -010,117 |+ -006,859 
4 | (b) |—-000,992 | — -011,538 | — -000,225 |+-010,424 |+ -006,858 | + °010,124 |+ -006,845 
(c) | — 000,985 | — 011,217 | — 000,188 |+ 010,432 |+ 006,869 |+ -010,130 |-+ 006,828 
(a) ) — 012,745 | — 000,160 |+ 013,131 |+ 009,558 |+ 023,162 |+ -010,155 
5 | (b) i) — 012,407 |—-000,083 | + 013,153 |+ 009,564 |+ 023,170 |-+ 010,125 
©) ) — °012,005 |— 000,007 |+ 013,177 | + °009,569 | + °023,173 |+ °010,084 
(a) 0 — -009,447 |+ :000,126 |4+°015,039 |+ -009,465 |+ -058,655 |+ 025,835 
6 | (0) 0 — 009,156 | + -000,311 |+°015,079 | + 009,468 |-+ °058,760 |+-025,838 
(ec) 0 — 008,854 |+ 000,533 | + ‘015,123 | + -009,468 |-+ 058,853 | + “025,824 
(a) 0 2 --004, 556 |-+ 000,352 |+ 007,494 + -012,697 |+ ‘032,408 |+ °037,813 
Henle) 0 —-004,377 | + 000,487 | + -007,493 |-+ 012,645 | + -032,448 | + -038,095 
(c) ) = 004,210 | + -000,642 | + “007,492 | + -012,574 |-+ -032;463 | +-038,361 
S(a) = +0.510573, S(b) = +0.517476, S(c) = +0.524735,
v = -0.060573, v = -0.017476, v = +0.025265.
TABLE IX.
Values of $\vartheta_s T_p\,\vartheta_{s'}\tau'_p$.
p s=1 s=2 s— | s=4 s=5 s=6 Si) p 
0 |—:002,689 | — :010,007 | — 000,229 | + -003,464 | + 002,307 +:004,970 | + °002,185 | 0 
1 |—:013,450 | — -023,810 |—-000,757 | — 004,673 | — :005,069 |— 016,314 |—°011,377] 1 
2 |—'023,067 | — -006,044 |+ 000,411 | — 004,262 |— -000,061 |+-012,444 |+°020,57 2 
3 |- 013,496 + -012,969 + 000,861 | + -004,653 + 003,938 | + 004,217 |— 013,142 | 3 
4 |—+000,450 | + :001,333 | — 000,030 |+-000,161 —*000,247 — -001,380 |+-000,613 | 4 
5 |—:003,007 | + -002,683 |+°000,531 |-+-002,490 + 001,499 | —-001,958 |— 002,239 | 5 
6 |— 003,082 |+ :005,161 |— -000,094 |+ 000,006 |— ‘001,446 |— 003,688 |+ 003,143 | 6 
0 | —*023,802 — :088,590 | — :002,030 | + 030,662 + °020,425 + 043,996 |+ 019,339 | 0 
1 |—°051,494 |— 091,158 |- -002,898 | — ‘017,889 |— -019,407 | —-062,458 | — "043,556 | 1 
2 |—+002,940 | — :000,770 | + -000,052 | — 000,543 — -000,008 | + 001,586 | + :002,623 | 2 
3 |+ 036,379 | — 034,958 | — 002,322 | — 012,542 | — ‘010,614 | — -011,368 |+ -035,425| 3 | 2 
4 |+004,781 |— :014,154 |-+-000,316 | — 001,707 + °002,625 |+ -014,650 — -006,510 | 4 
5 |+:'007,797 | -- :006,957 | — 001,378 | — 006,456 | — :003,887 | + 005,076 |+ °005,805 | 5 
6 |+:009,448 | — 015,822 |-+ -000,287 | —-000,017 + :004,434 +-011,306 —-009,636 | 6 
| 
0 |— 022,458 | - ‘083,587 | — 001,915 | + -028,931 |+°019,271 + °041,511 |+ 018,247 | 0 
1 |— 002,986 | — :005,286 |— 000,168 | — 001,037 |— 001,125 — -003,622 |— 002,526} 1 
2 | +045,338 | + 011,880 | — 000,808 | + -008,377 | + °000,120 | — -024,459 |— 040,449 | 2 
3 +°003,681 | — °003,537 | — -000,235 | — -001,269 |— 001,074 —-001,150 + 003,584 | 3 
4 |--007,651 |+ °022,655 | — -000,506 |+ :002,732 |— 004,201 | — ‘023,447 |+ 010,419 | 4 
5 |+°001,548 | — -001,381 | — 000,273 | — 001,281 |—*000,772 +°001,007 |+-001,152| 5 
6 |—:011,412 |+°019,111 | ~ 000,346 + -000,021 |— 005,356 |— 013,656 +°011,639 | 6 
0 |—*010,833 |— °040,322 |— -000,924 | + 013,956 |+ 009,296 + -020,025 + -008,802 | 0 
1 |+ 012,013 |+ ‘021,267 | + -000,676 |-+ -004,173 |+ 004,528 +°014,571 |+°010,161 | 1 
2 |+:017,109 |+ ‘004,483 | — -000,305 | + -003,161 |-+ °000,045 | — 009,230 |— -015,264] 2 
3 |—°014,068 |+ 013,518 |+ -000,898 | + -004,850 |+ °004,105 +:004,396 |— -013,699 | 3 
4 |— 002,101 |+ 006,221 | — :000,139 | + 000,750 |— -001,154 — 006,439 + -002,861 | 4 
5 | —:005,604 |+ 005,000 | + -000,990 |+ 004,640 |+ °002,794 |— 003,648 |—-004,172} 5 
6 |—-001,988 |+ 003,329 | — 000,060 | + -000,004 | — 000,933 | — 002,379 |-+-002,028| 6 
= = ae | — = 
0 |—:008,303 | — :030,904 | — 000,708 |+-010,696 |+°007,125 + °015,347 |+°006,746 | 0 
1 |+ 016,433 | + 029,090 |+ 000,925 |+ 005,709 + :006,193 + :019,931 + °013,899 | 1 
2 | +:003,809 | + 000,998 | — 000,068 |+ 000,704 + 000,010 | — 002,055 | — 003,399 | 2 
3 | —°015,502 |+°014,897 |+ ‘000,989 |+ 005,344 |+4 -904,523 |-+-004,844 |— 015,096 | 3 
4 + :001,087 | — 003,220 | + 000,072 | — 000,388 |+ 000,597 |-+-003,332 |— 001,481 | 4 
5 |—004,744 |+ -004,233 | + 000,838 | + -003,928 |+ 002,365 | — 003,088 |—*003,532 | 5 
6 | + 003,592 |-- 006,016 | + -000,109 |— 000,007 | + °001,686 | + -004,299 |— -003,664) 6 
0 |—-007,749 | — 028,843 | — -000,661 |+-009,983 + -006,650 + °014,324 |+°006,297 | 0 
1 |+°023,811 |+-042,152 | + ‘001,340 |+ 008,272 + °008,974 |+°028,881 |+-020,140} 1 
2 |—-014,636 | — 003,835 |-+ -000,261 | — -002,704 — -000,039 | + °007,896 |+°013,057 | 2 
3 | —:010,657 | +-010,241 |+ -000,680 |+ 003,674 |+-003,110 | + -003,330 |— 010,378 | 3 
4 | +4 °004,372 | —-012,946 |+ :000,289 |—-001,561 + :002,401 | + °013,399 |— 005,954) 4 
5 |+ 000,442 | —-000,394 | — -000,078 | — 000,366 — -000,220 | + °000,288 | + °000,329 | 5 
6 + 006,335 — -010,608 | + °000,192 | — 000,012 + °002,973 | + 007,580 | — 006,461 | 6 
| | | 
0 | — 003,242 | — 012,067 | — ‘000,276 }+ 004,177 +°002,782 | + °005,993 |-+ 002,634] 0 
1 |+°015,673 |+ 027,746 |+ 000,882 |+ 005,445 + -005,907 | +-019,010 |+-013,257] 1 
2 |—-025,614 | — 006,712 |+ 000,457 | — 004,733 — -000,068 | + -013,818 |+ "022,851 ] 2 
3 |+°013,663 | — 013,130 |— -000,872 | — 004,711 = 003,987 -- 004,270 |+ °013,305 | 3 
4 |—:000,037 |+ -000,111 | — -000,002 + -000,013 | — -000,020 | — 000,114 |+°000,051 | 4 
5 |+ 003,569 | — -003,184 | — 000,631 | — -002,955 | — 001,779 | + -002,323 | + 002,657 | 5 
6 |— 002,893 | + :004,845 | — 000,088 | + -000,005 | — :001,358 — :003,462 |+-002,951] 6 


— 
TABLE X.
Values of Sst Yelp. 
| 
s’ p | ja | BS s=3 s—4 i) s=6 s=7 p Sf 
0 |~-002,716 —-024,296 | — 019,919 |— -013,581 | — 005,206 | — 007,621 |- :002,113] 0 
1 |= °013,578 |— -050,535 | — 001,157 | + 017,491 |+-011,650 |-+-025,097 |+ 011,032] 1 
2 —+023,244 + -001,049 | + 041,494 | + 019,288 | + -000,298 | — 018,826 | — 020,060] 2 
1 | 3 | —-013,520 |+ 038,284 |-+ -001,489 | — 020,306 | — -010,442 | — 008,529 /+ 013,024] 3 | 1 
4 — -000,364 | + -004,567 | — 007,904 |— 002,108 |-+ -001,391 | + 005,282 |— 000,864 | 4 
5 — 002,999 | + -008,296 | + 000,596 |— -007,320 | — 002,654 | + 001,795 |+ 002,285 | 5 
6 = -003,074 |+ -008,906 | -011,032 — “000,956. + -003,290 | + 006,016 |— 003,149 6 
| 
0 —-010,399 | — -093,014 | — 076-260 | — 051,995 | — 019,931 | — 029,175 | — 008,088 | 0 
1 | ~*025,191 | — 093,756 | ~ -002,147 | + 032,451 | + 021,614 |+ 046,563 |+ "020,467 | 1 
2 | —:007,378 | + -000,333 |+°013,171 | + -006,123 | +-000,095 | — 005,976 | - 006,367] 2 
2 | 3 +:012,689 | — -035,930 | — 001,397 | + -019,057 | + 009,800 | + 008,004 |—-012,223] 3.] 2 
4 4°001,044 — -013,114 | + -022,696 + -006,054 | — -003,993 | — 015,167 |+*002,480| 4 
5 |+ 002,467 —-006,824 | — 000,490 | + 006,021 | + 002,183 | - 001,477 |— 001,880 | 5 
6 + °005,229 — -015,149 | + 018,767 + -001,627 | — ‘005,596 | - 010,234 |+ 005,357] 6 
0 |—+000,603 | — -005,392 | — 004,421 |— -003,014 | —-001,155 | - 001,691 |— 000,469 | 0 
1 |= "001,055 | — 003,927 | — -000,090 | + 001,359 |+-000,905 |-+ -001,950 |+ 000,857 | 1 
2 |-+ 001,029 | — -000,046 | — :001,837 | — 000,854 | — -000,013 | + 000,833 |+ 000,888 | 2 
3 | 3 |4+:001,143 | —-003,237 | — -000,126 |+ -001,717 |+ 000,883 |+ 000,721 |— 001,101] 3 | 3 
4 |—-000,052 | + 000,650 |— -001,125 | — -000,300 | + -000,198 | + “000,752 |— 000,123 | 4 
5 |+:000,736 |—-002,035 | — 000,146 | + -001,795 | + 000,651 | — 000,440 |— 000,561 | 5 
6 | —*000,209 | + -000,605 | — -000;749 | — 000,065 | + 000,223 |+ -000,408 |— -000,214 | 6 
0 | + 002,426 | + -021,699 | + 017,790 | + -012,130 | + 004,650 | + 006,806 |+ -001,887 | 0 
1 |= 002,758 | — -010,265 | — -000,235 | + 003,553 | + 002,366 | + -005,098 |+ 002,241] 1 
2 | —-003,451 | + -000,156 | +-006,161 |+-002,864 | + 000,044 | — -002,795 |— 002,979 | 2 
4 | 3 |+-002,772 | —-007,849 | — 000,305 |+ 004,163 |+ 002,141 | + -001,749 |— 002,670] 3 | 4 
4 | +000,134 | — -001,677 | + -002,902 | +-000,774 | — ‘000,511 | — 001,939 |+ -000,317| 4 
5 |+ 001,648 |— -004,560 | — 000,328 | + 004,023 |+ 001,459 | — 000,987 | 001,256 | 5 
6 | +000,342 | — -000,992 |-+ °001,229 + -000,107 | — :000,366 | — 000,670 |+ 000,351 | 6 
0 |+°003,318 | + -029,682 | + 024,335 | + °016,592 | + -006,360 | + -009,310 |+-002,581 | 0 
1 |— 006,504 | — -024,207 | — 000,554 | + 008,379 | + 005,581 | + 012,022 |+ 005,284 | 1 
2 |— 001,254 |+ -000,057 |-+-002,239 | + -001,041 | + -000,016 | — -001,016 |— -001,082 |} 2 
5 | 3 |+ 005,252 |—-014,872 — -000,578 | + 007,888 | + 004,056 | + 003,313 |— 005,059] 3 | 5 
4 |— 000,150 |+-001,883 | — -003,259 | — 000,869 | + 000,573 | + “002,178 |—-000,356 | 4 
5 | + 002,383 |—-006,592 | — 000,474 |+-005,816 | -+-002,109 | — -001,427 |- 001,816 | 5 
6 |—*001,457 | + -004,222 |— -005,230 |— 000,453 + 001,560 | + 002,852 | — 001,493 | 6 
| ae |e 2 — 
Q |+°004,809 +4 -043,011 | +°035,264 + -024,044 + -009,217 |+ -013,491 |+-003,740 | 0 
1 |— 014,659 | — 054,559 | — 001,249 |-+ 018,884 + -012,578 | + 027,096 |+ 011,910} 1 
2 |+ 009,127 | — -000,412 | -- -016,293 | — 007,574 — 000,117 |+ 007,392 |+-007,876 | 2 
6 | 3 |4 005,299 | —-015,004 | — 000,583 | + -007,958 + 004,092 | + -003,342 |— 005,104) 3 | 6 
4 |—:000,872 + -010,946 | — -018,945 | — -005,054 + 003,333 | + ‘012,661 |— 002,070} 4 
5 |—-000,656 + -001,815 | + 000,130 — 001,602 —-000,581 | + 000,393 |+ 000,500 | 5 
6 | — 003,758 + -010,889 | — 013,490 | - 001,169 + 004,023 |-+ :007,356 |— 003,851 | 6 
0 |+-003,165 + -028,310 | + -023,211 | + -015,825 + -006,066 | + 008,880 |+ 002,462 | 0 
1 |—:015,334 |— -057,071 | — 001,307 |+ 019,753 | + 013,157 | + 028,344 |+ 012,459 | 1 
2 |+-025,172 — -001,136 | — -044,936 | — -020,888 | — -000,323 | + 020,387 |+ 021,724 | 2 
7 | 3 |—-013,635 + -038,608 |+-001,501 | — -020,478 | —-010,531 | — -008,601 |+ 013,134] 3 | 7 
4 + 000,259 — -003,256 |+-005,635 |-+-001,503 —-000,991 |— -003,766 |+ 000,616 | 4 
5 |--003,579 + -009,899 | + 000,711 | — 008,734 | — -003,167 | + “002,142 | + 002,727 | 5 
6 |+-002,927 — -008,480 | +-010,505 |-+ -000,911 | — -003,133 | - 005,729 | + 002,999 | 6 
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We shall proceed to calculate A, for three values of 7 which lie near the 


probable value of r as found from each column. We will take these as .45, .50 and
.55; from these values we shall obtain $h_{s\cdot}$ for each column from (xiv) and,
interpolating the real $h_{s\cdot}$ between them, find the corresponding columnar r, which will
then be substituted in (xv) by aid of Table X to obtain the columnar mean $k_{s\cdot}$.
Table XI gives the values of Ay, for r=°"45, 50 and ‘55, and Table XII the 
resulting values of he : 


TABLE XI. Values of Axy for r='45, 50 and ‘55. 
r s=l1 s=2 ==) s=4 SO) s=6 s=7 
(a) |—-014,742,01 | — 020,616,58 —°000,400,18 +000,974,70 + ‘000,377,97 | + :000,409,54 | + -000,044,95 
(b) |—-017,038,00 | —-021,554,08 | — -000,383,88 + -000,731,59 | + -000,258,31 | +-000,246,06 | +-000,015,95 
(c) |—-019,587,49 | — -022,373,22 | — -000,356,40 +-000,518,95 | +-000,168,60 | + -000,136,31 | — -000,003,29 
(a) |—-043,836,23 | — -133,792,74 | — -003,545,25 + -021,169,83 | + -010,795,76 | +-015,963,45 | +-003,258,21 
b) | —-045,046,53 | —-140,080,50 | —-003,775,08 | +-019,705,30 | +:009,504,62 | +-012,993,41 | + -002,268,84 
’ ) Oy a] | 
(ce) |—*045,869,07 | — 146,859,24 | — -004,026,98 | +°018,090,53 | + ‘008, 150,14 | + -010,141,50 | + :001,500,21 
(a) |-—:014,665,26 | — -082,820,10 | — 002,204,29 | + °030,133,27 | + °018,460,19 | + °033,767,07 | 4+--009,791,12 
(b) | - -012,764,50 | —-082,030,73 | — 002,275,94 | +-030,479,17 | +:018,233,88 +-031,794,16 | +-008, 188,80 
(ce) |—-010,711,23 |—-080,956,49 | —-002,360,54 | + -030,869,67 | +-017,938,38 + -029,455,85 | + -006,551,72 
eee x el ee = eae 
(a) |—-003,450,60 | — -028,237,21 | —-090,587,66 | +-017,032,32 | +-011,713,27 + -024,762,35 | + -009,092,34 
(b) | —-002,645,25 | — -026,280,92 | — 000,528.69 +-017,630,94 | +012,084,98 + -024,998,89 | + 008,434.25 
(c) |—-001,920,26 | —-024,106,94 | — -000,459,56 | +-018,316,53 | +°012,492,17 + -025,139,70 | + -007,601,99 
(a) | —-001,562,59 | — -016,357,80 | —-000,196,08 | + :013,951,10 | +-010,408,16 | -+-024,456,57 | + -010,780,30 
b) |—-001,096,19 —-014,410,34 | —-000,106,48 | + 014,492.89 | +-010,926,94 | + 025,583,17 | +:010,698,56 
] ’ ) 7 Sj ? | 
(ec) |— 000,731,638 | —-012,372,26 | — -000,003,49 | +°015,100,01 | +°011,507,01 feb ipl o2 + °010,435,95 
(a) |—-000,728,92 | — -010,344,20 | +-000,068,82 | -+013,421,77 | -+-011,082,88 | +-029,840,53 | +-016,766,62 
b) | —-000,448,58 — -008,432,81 + -000,177,88 | +:013,793,06 | +-011,705,64 | +°032,119,62 |+-017,871,20 
) b] ? > t | irs | 
(e) | —°000,255,73 —:006,613,75 | + -000,295,92 + °014,164,31 + °012,382,26 + 034,601,52 | + °018,889,99 
ts | =| S — =| | a 
(a) |—-000,090,63 | — -002,150,92 | + -000,121,52 + -005,185,57 | 4+-005,018,14 | +-016,965,98 + -014,515,02 
(6) | —-000,037,11 — -001,530,11 +-000,149,03 | +-005,035,92 | + (005, 142,06 + -018,430,12 | +-016,770,70 
(ec) |—-000,000,75 | — -001,037,56 | +-000,167,89 + :004,808,83 -+005,217,99 | + -019,928,67 | +019,271,47 
| | | 
TABLE XII. Values of $h_{s\cdot}$ for Columns.

| r                   | s=1      | s=2     | s=3     | s=4     | s=5     | s=6      | s=7      |
| .45                 | -2.19299 | -.92114 | -.02549 | +.56650 | +.98336 | +1.45054 | +2.30569 |
| .50                 | -2.18505 | -.91848 | -.02538 | +.56616 | +.98315 | +1.44900 | +2.29880 |
| .55                 | -2.17567 | -.91485 | -.02521 | +.56580 | +.98286 | +1.44685 | +2.28999 |
| Actual $h_{s\cdot}$ | -2.19667 | -.91404 | -.02553 | +.56594 | +.98333 | +1.44723 | +2.29464 |
| Interpolated r      | .4229    | .5599   | .4167   | .5309   | .4585   | .5426    | .5249    |
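
The interpolation behind the last row of Table XII can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say which interpolation rule was used, so a parabola through the three tabled points (r = .45, .50, .55) is assumed here, and the function name `columnar_r` is hypothetical. With that assumption the printed values of r are recovered to about three decimal places.

```python
import math

# Table XII: h_s. computed for r = .45, .50, .55, and the actual h_s. of each column.
R_GRID = (0.45, 0.50, 0.55)
H_TABLE = {
    1: (-2.19299, -2.18505, -2.17567),
    2: (-0.92114, -0.91848, -0.91485),
    3: (-0.02549, -0.02538, -0.02521),
    4: (0.56650, 0.56616, 0.56580),
    5: (0.98336, 0.98315, 0.98286),
    6: (1.45054, 1.44900, 1.44685),
    7: (2.30569, 2.29880, 2.28999),
}
H_ACTUAL = {1: -2.19667, 2: -0.91404, 3: -0.02553, 4: 0.56594,
            5: 0.98333, 6: 1.44723, 7: 2.29464}
PRINTED_R = {1: 0.4229, 2: 0.5599, 3: 0.4167, 4: 0.5309,
             5: 0.4585, 6: 0.5426, 7: 0.5249}


def columnar_r(h_values, h_actual, r_grid=R_GRID):
    """Inverse interpolation: find r at which a parabola through the three
    tabled (r, h) points takes the value h_actual (assumed rule, see lead-in)."""
    h_lo, h_mid, h_hi = h_values
    step = r_grid[1] - r_grid[0]
    # h(t) = h_mid + lin*t + quad*t^2, with t = (r - r_mid) / step
    lin = (h_hi - h_lo) / 2.0
    quad = (h_hi + h_lo - 2.0 * h_mid) / 2.0
    const = h_mid - h_actual
    if abs(quad) < 1e-12:
        t = -const / lin
    else:
        disc = math.sqrt(lin * lin - 4.0 * quad * const)
        roots = ((-lin + disc) / (2.0 * quad), (-lin - disc) / (2.0 * quad))
        t = min(roots, key=abs)  # the root nearest the tabled range
    return r_grid[1] + t * step


if __name__ == "__main__":
    for s in sorted(H_TABLE):
        print(s, round(columnar_r(H_TABLE[s], H_ACTUAL[s]), 4), PRINTED_R[s])
```

Columns 1, 2 and 3 require extrapolation slightly beyond the tabled range of r, which is why the root nearest that range is selected.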
We have thus the values of r found from each column*.


We now turn to Table X and calculate in exactly the same way the values of 


Neg = ete Ty ere tay Ly ee. a PE et Sy lias ow eee, 


for the r peculiar to each column for that column. We thus obtain Table XIII. 


TABLE XIII. 
Values of r’sy for r of each Vertical Column. 


3’ Sil s=2 s=3 s=4 s=5 s=6 s=7 

1 — ‘013,707,54 | — :044,362,34 | — -013,376,98 | — 002,394,74 | —-000,770,04 | — :000,212,73 | — 000,006, 10 
2 | —'021,315,40 | —°153,841,07 | — (074,192,36 | — -029,418,02 | — :009,240,63 | — :006,036,02 | — ‘000, 641,41 
3 | — 000,771,58 | — 008,202,77 | — -004,826,27 | — 002,225,87 | — ‘000,633,67 | — -000,217,60 |+ -000,030,11 | 
4 | +:000,880,64 | + °014,176,58 | + °018,829,61 | + °015,680,02 | + °005,954,01 | + °008,797,09 |+ -001,837,83 
5 |+:000,759,52 | +°013,488,41 | + °024,318,82 | + -022,680,28 | + :009,395,75 | 4+ -016,257,72 |+ -004,194,22 
6 | +:000,584,54 | -+°011,211,78 | + :031,232,08 | + -032,630,32 | + °015,526,73 | + :032,207,15|+ ‘O01 1,205,66 
7 |+:000,127,47 | + 002,739,883 | + °015,206,16 | + -017,131,65 | 4--010,878,46 | + 028,515,80 | + 017,104,71 
lege | — 963,72 — 507,66 — ‘010,29 + °303,04 + 425,95 +°789,11 + 1:238,75 


The values in Table XIII divided by $\bar{n}_{ss'}/n_{ss'}$ from Table XIV and summed for
each column give, on multiplication by $N/n_{s\cdot}$, the $k_{s\cdot}$ of the last row of the table.
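
As a check on this rule, the following sketch reworks column s = 5. The entries are those read from the s = 5 columns of Tables XIII and XIV; the column total 69 and N = 1000 are taken from the worked example given later for the same array. The function name is hypothetical. The result agrees with the value .42595 in the last row of Table XIII.

```python
# Column s = 5 of Table XIII (rows s' = 1..7), as read from the table.
TABLE_XIII_COL5 = [-0.00077004, -0.00924063, -0.00063367, 0.00595401,
                   0.00939575, 0.01552673, 0.01087846]
# Column s = 5 of Table XIV (rows s' = 1..7), as read from the table.
TABLE_XIV_COL5 = [0.366258, 1.351776, 0.853291, 0.855012,
                  0.968288, 1.619040, 0.848918]
N_TOTAL = 1000      # total number of father-son pairs
N_COLUMN_5 = 69     # total of the fifth columnar array


def columnar_mean(entries, ratios, n_total, n_column):
    """Divide each Table XIII entry by the matching Table XIV ratio,
    sum down the column, and multiply by N / n_s."""
    column_sum = sum(entry / ratio for entry, ratio in zip(entries, ratios))
    return column_sum * n_total / n_column


if __name__ == "__main__":
    k5 = columnar_mean(TABLE_XIII_COL5, TABLE_XIV_COL5, N_TOTAL, N_COLUMN_5)
    print(round(k5, 5))  # compare with +.42595 in the last row of Table XIII
```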


To obtain Table XIV we must return to Equation (x), use the appropriate r for 
the column and the values in Table III of 3,7, Sy, 7,’.. Taking o, and o, as units 
of the horizontal and vertical variates we can plot $k_{s\cdot}$ in Table XIII to $h_{s\cdot}$ from
Table XII and so obtain the regression line as formed by the means of each column,
and set against it the regression lines as found from polychoric r = .5034, or .5204.


TABLE XIV.
Values of $\bar{n}_{ss'}/n_{ss'}$ for Columnar Values of r.
Sf s=1 s=2 s=3 s=4 s=5 s=6 s=7 
1 | 1-481,160 | -918,247 | 881,901 | ee ‘366,258 ee) ee 
2 | +850,270 | -995,746 | -946,290 | 1:319,975 | 1°351,776 | 1:261,565 oo 
3 | -910,673 | 1:075,087 | 1-090,803 | 830,945 | 853,291 | -876,250 | 1-668,232 
4 | 1:846,116 | 1:016,776 | 1-:066,432 | 856,640 | 855,012 | 1:248,823 | -600,417 | 
5 2 | :866,469 | 1-:028,801 | -992,395 | -968,288 1°018,112 | -937,952 
6 2 | 980,885 | 887,677 | 1-263,099 | 1-619,040 | -803,917 | -999,089 
a a0 454,171 808,851 | 1°368,315 848,918  1°316,691 | 1:074,517 


* The mean value of r weighted with the column totals is .5022, which is in reasonable accord with
(i.e. within the probable error of) the results on p. 142.
This is done in Diagram I. But what we actually desire is to compare the
observations and the regression lines as given by the present polychoric method with
those obtained by product-moment methods.


[Diagram I. Horizontal axis: Stature of Father in Inches. Vertical axis: Stature of Son in Inches.]


Our actual data from which the table on p. 135 was obtained are given in 
Table XV. The following are the values of the constants in inches : 


Mean Stature of Father: $\bar{x}$ = 67.878.

Mean Stature of Son: $\bar{y}$ = 68.845.

Standard Deviation of Father: $\sigma_x$ = 2.6576.
Standard Deviation of Son: $\sigma_y$ = 2.6855.
Correlation of Father and Son: r = .5189 ± .0160.


In Diagram II the regression line (slope, .5245) with means of the arrays as
dark circles is given. Against this we have put as hollow circles the values of
$h_{s\cdot}$ and $k_{s\cdot}$ multiplied by their respective S.D.'s to indicate the result as worked
out in the present paper. The closeness of the polychoric coefficient .5204 and
the product-moment coefficient does not permit of two regression lines being
drawn. It will be seen that the fit to the observations by use of broad categories
and the polychoric method is really quite as satisfactory as the fit by the
product-moment method. But the amount of arithmetical work is incomparably greater
by the former, even if it be less than Ritchie-Scott's process with 49 cells would be.


Accordingly we now proceeded to investigate the extent to which
approximations shortening the arithmetic would introduce serious error. The first question
to be answered is: To what extent in finding the means $k_{s\cdot}$ of the arrays is it
needful to use the actual value of the correlation coefficient as found for each
column? In order to test this we proceeded to find the $k_{s\cdot}$ for each columnar


[Diagram II.]


KARL PEARSON AND EGON S. PEARSON 151
TABLE XV.
Correlation of Stature in 1000 pairs, Father and Son.
(Columns: Stature of Father. Rows: Stature of Son.)

TiS eae eee i [ees cae acti leet. elpastadile cea sital 

BlSlelefelsef{elslelsef;sese;e|/s}ye fe] s 

© | | | |S |S | | | S| & | | & | H | & | & | & | & | Totals 

yee eSellaee | saee ess iP ss] ime | 5. || se | So foe & Hl hl & |i] & 

ote) ie) | S Ne) ie) ie) Ne} Ne) Ne) ie) ton ~ to ~ | ~~ NN 
Oo eal ee | | | ae 2 
60875 | ae mers ee ea ee n 
Bie ae =|}, I Na he ei VS A Ua a easy ee 6 
Cs pee Oe aah Sale Po Sl Ole bse Pee ee) ee 90 
Coat We ciel | 6h S|) Oo | 9 | 8 | 1 heart, ero 
64875-12131 3] 5] 11 | AICO, We | ede | ale epee ae en eh a 70 
uso |) de) 2 6 | “9 | 10/20) 17)15| 7) 6 ey) ey! 
66"°875—| 1|—| 6| 4] 11] 24] 21 | 28}10}/ 12] 7] 4] 1 129 
67"875—| — | 2 | 2] 7] 9 | 20| 16 | 33 | 27 | 26) 20] 13]; 6| —|—j|—]|—|] 181 
CoO oe | alt 1s) 4.) py 19/13 | Jo | 92 | 96 | 24.) 6) 9/2! 1 |—| —F 125 
69 s76— | — | — | — | — | 6] 11) 15/18} 18 | 23) 18/13) 4) 4/1 | 1 | —7 131 
10" 875— | eRe Aun oe Lon ilsalbe sei |e) Wale eed = 80 
VS pe Wes Potala yay | Gnleeor Okay Wasi tT (aw Sky 57 
ee eo Oo 18h deh ONS On Te ey a Oe 36 
VE Sis) | SS) SS SS a al by |) abu Sieh yy) ae Se 21 
Wiycey 5 |) — ee ee rhe eee Se) OM one sar eo Pope fee | SS 8 
i pee Se | ee gy | ee Pf 1 
GO”"S7G= || = | SS he SSS SS SS SS SS SSS SS SS SS a a eS 2 
CPO || a a a ne es ee ene ee ee ee 3 
Oe) | SS Sh | |} iad ea (San ea ey a 1 
Totals | 7 36 | 63 | 109 fant 139 | 125 1000 


array for the same correlation coefficient, and we took for the value of that
coefficient .5000, somewhat under the value found by either polychoric coefficient.
Table XVI gives our results.


It involved finding a new series of values for 


N'sv, but those for figy/nsy have already been computed under (b) in Table IV. The 
results are given in terms of inches. 


TABLE XVI.

Columnar Means by Different Processes.

| s | $h_{s\cdot} \times \sigma_x$ | $k_{s\cdot} \times \sigma_y$, each column its own r | $k_{s\cdot} \times \sigma_y$, each column for r = .50 | $k_{s\cdot} \times \sigma_y$, each column assumed Normal | Common base |
| 1 | -5.8379 | -2.5881 | -2.6498 | -2.4809 | ?       |
| 2 | -2.4292 | -1.3633 | -1.3531 | -1.4357 | ?       |
| 3 | -.0678  | -.0276  | -.0176  | -.0701  | ?       |
| 4 | +1.5040 | +.8138  | +.8087  | +.7632  | ?       |
| 5 | +2.6133 | +1.1439 | +1.1511 | +1.0866 | 3′ + 4′ |
| 6 | +3.8462 | +2.1192 | +2.1122 | +2.1744 | 4′ + 5′ |
| 7 | +6.0982 | +3.3267 | +3.3194 | +3.1109 | 5′      |
An examination of the fourth column of Table XVI shows us that we have
not for practical purposes seriously modified the columnar means by using r = .50
instead of the individual value for each column. This is illustrated in Diagram III,
where except in the case of the first array there is hardly daylight between the
two series of points.


[Diagram III. Horizontal axis: Stature of Father in Inches. Vertical axis: Stature of Son in Inches.]


In Diagram III the hollow circles give the means with r obtained for each column, the nearly
superposed dark circles the means with r = .5000.


The solution of the problem therefore falls back on Equations (x), (xvit) and 
(xv). We should still have to calculate $,7,, 35 T »’, Ss Tp and 9; T,,’, but we should 
only need the three series of products 35 Tp Sy Ty’, 3s Tp Sy Tp and 95 Tp Sy Ty’, and 
to obtain k,. it would be adequate to use a value of r for which fisy/n.y had been 
found for the final interpolation. Still this involves very lengthy arithmetic, and 
we naturally crave for a still easier process. The present full working out of a 
numerical example enables us for the first time really to test the adequacy of an
easier method of dealing with such polychoric tables which has been long in use 
as an approximate method in the Biometric Laboratory. 


(6) It is clear that if we could find the means of the columnar arrays, we 
could readily obtain the correlation and the regression line by aid of the correlation 
ratio corrected for class index. The whole problem accordingly turns on a ready 
means of reaching—at any rate—an approximate value of the mean of a columnar 
array. This array is the slice between two parallel planes of a normal correlation
surface.


In the case of a surface of zero correlation

$$z = \frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac12\left(X^2/\sigma_x^2 + Y^2/\sigma_y^2\right)}$$

the slice between $X_1$ and $X_2$ has for its volume on $dY$

$$dY\,\frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac12 Y^2/\sigma_y^2} \int_{X_1}^{X_2} e^{-\frac12 X^2/\sigma_x^2}\,dX;$$

the slice is therefore given by the normal curve:

$$\text{Ordinate} = \text{const.} \times e^{-\frac12 Y^2/\sigma_y^2}.$$


It seems therefore not unreasonable after the surface of revolution is stretched
and slid into a correlation surface to assume the slice to be still approximately a
normal curve. Unfortunately the determination of the best mean and standard
deviation for normal material given in broad categories does not admit of very easy
solution. What we need is the difference between the means of a columnar array
and of a marginal frequency as a multiple of the standard deviation of the latter.
We shall obtain results differing more or less from each other according to the
individual broad category we take as the basis of comparison between $\sigma_s$, the
standard deviation of the sth slice, and $\sigma_y$, the standard deviation of the marginal
frequency. In fact the range of any broad category or of any combination of broad
categories, except the tail categories, can be made a means of linking up $\sigma_s$ and $\sigma_y$.
A little experience, however, shows (a) that it is undesirable to find the $\sigma_s$ of
any array from a category of small frequency, and (b) that for arrays of small total
frequency symmetrical tripartite divisions as far as feasible are the best*. The last
column in Table XVI shows the system selected for each of our columnar arrays.


Take, for example, s = 5; the columnar array may be taken on the base of 3′
and 4′ categories as

| Categories   | Columnar array 5 | Marginal distribution |
| 1′ + 2′      | 9                | 335                   |
| 3′ + 4′      | 36               | 421                   |
| 5′ + 6′ + 7′ | 24               | 244                   |
| Totals       | 69               | 1000                  |

the last column being the corresponding marginal distribution. The corresponding
proportional frequencies up to the dichotomic planes are: .1304, .6521 for the
columnar array and .3350, .7560 for the marginal distribution. The distances of the
mean† from the two dichotomic planes in the first case are

$-1.1245\,\sigma_5$ and $+.3910\,\sigma_5$,

and in the second case

$-.4261\,\sigma_y$ and $+.6935\,\sigma_y$,

where $\sigma_5$ is the standard deviation of the normal curve assumed to represent the
columnar array 5. Accordingly the range of 3′ + 4′ categories

$$= 1.5155\,\sigma_5 = 1.1196\,\sigma_y,$$

which gives $\sigma_5$ in terms of $\sigma_y$.


* The probable error of a standard deviation found in this way is discussed in Biometrika, Vol. XII.
p. 129.
† Found from the Probability Integral Table.
Hence the distance between the means is

$$.6935\,\sigma_y - .3910\,\sigma_5 = \{.6935 - .3910 \times 1.1196/1.5155\}\,\sigma_y = .4046\,\sigma_y$$

= 1.0866, if we introduce the value of $\sigma_y$.
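
The worked example for s = 5 can be retraced with a short sketch. It assumes the frequencies are 9, 36, 24 for the columnar array and 335, 421, 244 for the marginal distribution (readings consistent with the totals 69 and 1000), and it uses the inverse normal integral from the Python standard library in place of the Probability Integral Table. The function name is hypothetical.

```python
from statistics import NormalDist


def offset_of_array_mean(column_counts, marginal_counts):
    """Distance of the columnar mean from the marginal mean, in units of the
    marginal standard deviation. Each argument is a triple of frequencies
    (below the base, in the base categories, above the base)."""
    std = NormalDist()  # standard normal curve
    col_total = sum(column_counts)
    mar_total = sum(marginal_counts)

    # Deviates of the two dichotomic planes from each mean, each in its own s.d.
    c_lo = std.inv_cdf(column_counts[0] / col_total)
    c_hi = std.inv_cdf((column_counts[0] + column_counts[1]) / col_total)
    m_lo = std.inv_cdf(marginal_counts[0] / mar_total)
    m_hi = std.inv_cdf((marginal_counts[0] + marginal_counts[1]) / mar_total)

    # The base categories span the same range on both curves,
    # which links the array s.d. to the marginal s.d.
    sigma_ratio = (m_hi - m_lo) / (c_hi - c_lo)  # sigma_s / sigma_y

    # Upper plane seen from the marginal mean minus the same plane seen
    # from the array mean, both in units of sigma_y.
    return m_hi - c_hi * sigma_ratio


if __name__ == "__main__":
    offset = offset_of_array_mean((9, 36, 24), (335, 421, 244))
    print(round(offset, 4))  # compare with .4046 in the text
```

The same value results if the lower dichotomic plane is used instead of the upper, since the ratio of the standard deviations is fixed by the range between the two planes.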


This and the corresponding values are recorded in the fifth column of 
Table XVI. It will be seen that these values approximate to those in the third 
column, the greatest differences being in the small first and last arrays. 


Of course in actually working with material solely given in broad categories we
use the value .4046, treating $\sigma_y$ as our unit of measurement. The means of the
columnar arrays can be found with great ease and with considerable approximation
by this method.


If we now proceed to take the mean of our means duly weighted with their
frequencies, we find it to be -.0510, not a very serious divergence from zero.
However, we subtract it* from the means in the fifth column of Table XVI,
multiply the squares of the remainders by the corresponding frequencies, sum and
divide by the square of $\sigma_y$. Thus we obtain


$$\eta^2 = \frac{1.8188034}{7.2119103} = .2521944,$$

or: $\eta = .50219$.


If we divide by the class index correlation of the x-variate, i.e. .9623297, we
obtain
$\eta = .5218$,
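
The arithmetic of this step is short enough to retrace. The sketch below takes the quoted figures as read (mean square of the columnar means 1.8188034, variance 7.2119103, class index correlation .9623297); the function name is hypothetical.

```python
import math


def corrected_correlation_ratio(mean_square_of_means, variance, class_index):
    """Correlation ratio from the weighted mean square of the array means,
    divided by the class index correlation of the x-variate."""
    eta = math.sqrt(mean_square_of_means / variance)
    return eta / class_index


if __name__ == "__main__":
    eta_corrected = corrected_correlation_ratio(1.8188034, 7.2119103, 0.9623297)
    print(round(eta_corrected, 4))  # compare with .5218 in the text
```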


which correlation ratio we may take to be the correlation coefficient and compare
with our polychoric coefficient .5204 (p. 142). Clearly although our means as found
by the hypothesis of normal distribution of the columnar arrays agree only
approximately with the polychoric means of the third column of Table XVI, they lie
practically on the same regression line, as Diagram IV indicates. We conclude,
therefore, that in this case as probably in many like cases, it is quite adequate to
obtain the means of the columnar arrays by treating them as normal distributions,
then determining their correlation ratio and correcting it for the class index. The
corresponding regression line with the means of the columnar arrays indicated
will be for many purposes an adequate graph showing the general nature of the
correlation.


The general purpose of this paper has now been fulfilled; it has been shown 
how a general polychoric coefficient covering all the data provided in a given 
contingency table may be found, and how a graph may be drawn representing such 
a table effectively. At the same time such a process is very laborious and probably 
will not be lightly undertaken or only in cases of grave uncertainty. The method 


* Correlation ratio without subtraction = .5222.
† See p. 142.
is one of fitting the “best” normal surface to the data subject to the limitation
that the marginal totals are exactly reproduced, and this limits the generality.

An example has been given of the process, but it is seen from this example that
the heavy arithmetic does not lead us to any more accurate value for the correlation
than far simpler methods. Thus:


Correlation from product-moment = .5189 ± .0160.
Polychoric Correlation Coefficient “Best Fit” = .5034.
Polychoric Correlation Coefficient “Product Moment” = .5204.
Mean Square Contingency, Corrected for Class Indices = .5179.
Correlation Ratio from means of arrays = .5218.


The latter method, which has been long in use in the Biometric Laboratory, is 
thus, when used with due precaution, seen to be justified by the theoretically 
preferable polychoric method. If a method could be discovered of finding uniquely 
the mean of a columnar array, using all its cells at the same time, this method
would still more effectively replace the polychoric correlation coefficient. 


ON EXPANSIONS IN TETRACHORIC FUNCTIONS. 
By JAMES HENDERSON, M.A., B.Sc.


(1) WE define the tetrachoric function of order s to be $\tau_s(x)$, where

$$\tau_s(x) = \frac{1}{\sqrt{s!}}\left(-\frac{d}{dx}\right)^{s-1}\frac{e^{-\frac12 x^2}}{\sqrt{2\pi}} \qquad \text{(i).}$$

Other writers have adopted various other values for the external numerical
factor but this is immaterial. The factor $1/\sqrt{s!}$ was chosen because it gives an
extremely simple expression for the volume of a quadrant of the normal bivariate
frequency surface, and because for tabulating the numerical values of the functions
it is necessary to have some reduction factor of this kind to keep them of
manageable size. We can usually drop the argument x and speak of $\tau_s$. The values of $\tau_s$
for s = 1 up to s = 6 are tabled to five decimal places in the book, Tables for
Statisticians and Biometricians*, for values of $\frac12(1-\alpha)$ (which is really $\tau_0$ when
the argument is negative) from .000 to .500 at intervals of .001. With a different
multiplier they have been tabled by Charlier† to four decimal places only for
s = 1, 4 and 5 (x = .00 to 3).
The general form of the tetrachoric function of order s is

$$\tau_s(x) = \frac{1}{\sqrt{s!}}\left\{x^{s-1} - \frac{(s-1)(s-2)}{2}\,x^{s-3} + \frac{(s-1)(s-2)(s-3)(s-4)}{2\cdot 4}\,x^{s-5} - \text{etc.}\right\}\frac{e^{-\frac12 x^2}}{\sqrt{2\pi}} \qquad \text{(ii),}$$

that is, the ordinate of the normal curve of errors multiplied by a polynomial of
degree (s - 1). $\tau_1$ is simply the ordinate of the normal curve, while $\tau_0$ is the area
of the tail of the normal curve up to a given abscissa x, with the addition of an
arbitrary constant. This constant may be so selected that

$$\tau_0 = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\,dx,$$

and will be found from the tables of the probability integral. It will be equal to
$\frac12(1+\alpha)$, if x is positive and $\frac12(1-\alpha)$, if x be negative in the usual notation.
Accordingly the expansion of a function of x, f(x), in a series of tetrachoric
functions, is really the expansion of the difference of the function and a multiple
of the probability integral in terms of

$$\left(c_0 + c_1\frac{x}{\sigma} + c_2\frac{x^2}{\sigma^2} + \dots\right) e^{-\frac12 x^2/\sigma^2},$$

where $\sigma$ and $c_0, c_1, c_2 \dots$ are at our choice.
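
For orders s of 1 and upwards the tetrachoric function is a Hermite-type polynomial of degree s - 1 multiplied by the normal ordinate and divided by the square root of s!. A minimal numerical sketch follows. The recurrence used to build the polynomial is the standard one for these polynomials and is not taken from the paper, and the names `hermite` and `tau` are hypothetical.

```python
import math


def hermite(n, x):
    """Polynomial x^n - n(n-1)/2 x^(n-2) + ... built by the recurrence
    He_{k+1}(x) = x He_k(x) - k He_{k-1}(x), with He_0 = 1 and He_1 = x."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def tau(s, x):
    """Tetrachoric function of order s >= 1 at abscissa x."""
    if s < 1:
        raise ValueError("this sketch covers orders s >= 1 only")
    ordinate = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return hermite(s - 1, x) * ordinate / math.sqrt(math.factorial(s))


if __name__ == "__main__":
    for s in range(1, 7):
        print(s, round(tau(s, 1.0), 5))
```

The order s = 0 is left out because it is an area of the normal curve rather than a polynomial times the ordinate.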


* Cambridge University Press, p. 1, and pp. 42—51. 
† Vorlesungen über die Grundzüge der mathematischen Statistik, 1920.
The real reason for adopting

$$c_0'\tau_0 + c_1'\tau_1 + c_2'\tau_2 + c_3'\tau_3 + \dots,$$

instead of the above expression, is that the calculation of the constants $c_0', c_1', c_2' \dots$
is more direct than that of $c_0, c_1, c_2 \dots$ because the tetrachoric functions are
semi-orthogonal functions*. It will be seen that the problem of expansion in
tetrachoric functions is closely related to a theorem of Laplace. If U be a unimodal
function of x within the range under discussion and the integral $I = \int U\,dx$ be
required, Laplace transfers to the mode m as origin so that $x = m + \xi$ and writes U
in the following form:

$$U = U_m\, e^{-\frac12 \xi^2/\sigma^2}\,(1 + a_1\xi + a_2\xi^2 + \dots).$$

He extends the limits to $\infty$ in both directions by supposing U = 0 outside the given
range and in the integration applies the well-known values of
$\int_{-\infty}^{\infty} \xi^s e^{-\frac12 \xi^2/\sigma^2}\,d\xi$, i.e.
zero if s be odd, and again if s be even (= 2r),

$$\int_{-\infty}^{\infty} e^{-\frac12 \xi^2/\sigma^2}\,\xi^{2r}\,d\xi = (2r-1)(2r-3)\dots 3.1.\sqrt{2\pi}\,\sigma^{2r+1}.$$


It will be seen that Laplace is really proceeding by expansion in tetrachoric 
functions as the process is precisely the same whatever be the limits of the integral 


of U. Following Laplace we develop our function in “incomplete normal moment 
. ee vw gS e-2? Ae : ; sfube! 
functions, ie. TS du+; it is better to use tetrachoric functions. The series in 
—a 297 


tetrachoric functions seems to converge slightly better than that in incomplete 
normal moment functions. 


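If we have

$$F(x) = a_1\tau_1 + a_2\tau_2 + a_3\tau_3 + \dots + a_s\tau_s + \dots,$$

then, assuming we may integrate the right-hand side of this equation term by
term (i.e. assuming uniform convergence) between x and $\infty$,

$$\int_x^{\infty} F(x)\,dx = a_1\tau_0 + \frac{a_2}{\sqrt{2}}\,\tau_1 + \frac{a_3}{\sqrt{3}}\,\tau_2 + \dots,$$

since

$$\int_x^{\infty} \tau_s\,dx = \frac{\tau_{s-1}}{\sqrt{s}} \qquad \text{(iii).}$$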


* A series of functions $f_1(x), f_2(x) \dots f_s(x) \dots f_{s'}(x)$ is orthogonal if
$\int f_s(x) f_{s'}(x)\,dx = 0$ when s and s' are not equal, the integration being
throughout the range. They are semi-orthogonal if

$$\int f_s(x) f_{s'}(x)\,\phi(x)\,dx = 0,$$

$\phi(x)$ being a function of x peculiar to the series. In other words a system is orthogonal if the sums of
the products of different order functions vanish without weighting for x. A system is semi-orthogonal
if we require to weight the values of x to obtain the vanishing of the product sum. This weighting is
the great disadvantage of semi-orthogonal functions. In our case of the tetrachoric functions the weighting
factor is $e^{\frac12 x^2}$ or the tails of series are excessively weighted.

† Discussed Biometrika, Vol. VI. p. 59. Tables of these functions up to s = 10 are given in Tables
for Statisticians, pp. 22-3.
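Let $\tau_s = \dfrac{1}{\sqrt{s!}}\,p_{s-1}\,\dfrac{1}{\sqrt{2\pi}}\,e^{-\frac12 x^2}$, where $p_{s-1}$ is the polynomial in $x$ of degree $(s-1)$ in (ii).

Let $\tau_{s'}$ be another tetrachoric function and suppose $s'$ is greater than $s$. Then
$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}\,e^{\frac12 x^2}dx = \frac{1}{\sqrt{s!}\,\sqrt{s'!}}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} p_{s-1}\left(-\frac{d}{dx}\right)^{s'-1} e^{-\frac12 x^2}\,dx.$$

Now since $\tau_s(\infty)$ and $\tau_s(-\infty)$ will always be zero owing to the exponential factor ($s > 0$) we can integrate by parts, transferring the $\dfrac{d}{dx}$ from the exponential to the polynomial; therefore
$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}\,e^{\frac12 x^2}dx = \frac{1}{\sqrt{s!}\,\sqrt{s'!}}\,\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{dp_{s-1}}{dx}\left(-\frac{d}{dx}\right)^{s'-2} e^{-\frac12 x^2}\,dx.$$

The integrated part at every step vanishes at the limits and ultimately
$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}\,e^{\frac12 x^2}dx = \frac{1}{\sqrt{s!}\,\sqrt{s'!}}\,\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{d^{\,s'-1}p_{s-1}}{dx^{\,s'-1}}\,e^{-\frac12 x^2}\,dx.$$

Since $p_{s-1}$ is a polynomial of degree $(s-1)$ and $s'$ is $> s$ the differential of the polynomial vanishes, i.e.
$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}\,e^{\frac12 x^2}dx = 0 \tag{iv}.$$
If $s' = s$ then the differential of $p_{s-1}$ reduces to $(s-1)!$ so that
$$\int_{-\infty}^{\infty}\tau_s^2\,e^{\frac12 x^2}dx = \frac{(s-1)!}{s!}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\frac12 x^2}dx = \frac{1}{s\sqrt{2\pi}} \tag{v}.$$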
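The two properties (iv) and (v) can be checked numerically. The following is only a sketch: it assumes the form $\tau_s = He_{s-1}(x)\,e^{-\frac12 x^2}/\sqrt{2\pi\,s!}$ (with $He_n$ the Hermite polynomial generated by repeated differentiation of $e^{-\frac12 x^2}$), and the function names, the integration range and the step count are illustrative choices, not part of the original paper.

```python
import math

def hermite_he(n, x):
    """Hermite polynomial He_n(x), built from He_{k+1} = x*He_k - k*He_{k-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def tau(s, x):
    """Tetrachoric function of order s >= 1 (assumed form, see lead-in)."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return hermite_he(s - 1, x) * phi / math.sqrt(math.factorial(s))

def weighted_product(s, t, half_width=12.0, steps=24000):
    """Trapezoidal estimate of the integral of tau_s * tau_t * exp(x^2/2)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * tau(s, x) * tau(t, x) * math.exp(0.5 * x * x)
    return total * h

# (iv): different orders give zero; (v): equal orders give 1/(s*sqrt(2*pi)).
print(weighted_product(3, 5))
print(weighted_product(4, 4), 1.0 / (4.0 * math.sqrt(2.0 * math.pi)))
```

The weighting by $e^{\frac12 x^2}$ is applied explicitly in the integrand, which is exactly the tail weighting the footnote on semi-orthogonal functions warns about.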


These equations (iv) and (v), which give the fundamental properties of the tetrachoric functions, enable us to expand any function $F(x)$ in terms of tetrachoric functions if we can find the value of the integral
$$\int_{-\infty}^{\infty} F(x)\,\tau_s\,e^{\frac12 x^2}dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(x)\,\frac{p_{s-1}}{\sqrt{s!}}\,dx \tag{vi}.$$
Since $p_{s-1}$ is an integral function of $x$, this amounts to saying that we can expand any function of which we are able to determine the successive moment-coefficients.


The practical value of the functional expansion when obtained is, however, 
a very different matter. That depends on the convergency of the series and our 
experience has shown us that in the most common cases the convergency is so 
slight or non-existent as to render the expansion idle. 
The matter is a very important one, for Thiele*, Edgeworth† and Charlier‡ have proposed to treat skew frequency distributions by a process which amounts to the same thing as the expansion by tetrachoric functions.

An attempt made many years ago§ to expand Incomplete Γ- and B-functions by Laplace's method in Incomplete Moment Functions convinced Professor Pearson that little was to be gained by a series expansion in the form of a polynomial multiplied by the ordinate of a normal curve. A variant of this method, that of expressing Incomplete Γ- and B-functions in a series of tetrachoric functions, was tried a year ago and it was found that except for a small distance round the mode this method of expressing a frequency distribution was quite ineffectual. The matter is of considerable importance because quite recently a Scandinavian actuary in America|| has been analysing mortality curves by tetrachoric functions and asserts not only that they give a good fit but apparently believes that each function of the series has some natural physiological meaning! It is quite possible to represent the survivors of 100,000 persons born in the same year of life by a Fourier's series from 0 to 100 years, but one would hardly claim any special physiological significance for the individual periodic terms¶. Such a series, however, is far easier to deal with in later treatment, such as differencing, than a series in tetrachoric functions.

For the numerical calculation of the tetrachoric functions the difference equation of these functions is invaluable, i.e.
$$\tau_s = \beta_s\,x\,\tau_{s-1} - \gamma_s\,\tau_{s-2},$$
where $x$ is the argument of the functions and
$$\beta_s = \frac{1}{\sqrt s},\qquad \gamma_s = \frac{s-2}{\sqrt{s(s-1)}}.$$

Tables of $\beta_s$ and $\gamma_s$ are given in Tables for Statisticians (p. 1 of introduction) to five decimal places for s = 7 to s = 24 (the first six tetrachoric functions being given on pp. 42—51) and in Biometrika, Vol. XIV. p. 130 to 7 decimal places.

For our work $\beta_s$ and $\gamma_s$ were required to 7 places (sometimes to 8) to obtain the requisite accuracy. The procedure consists in calculating $\tau_1$, which is equal to $\dfrac{1}{\sqrt{2\pi}}e^{-\frac12 x^2}$, directly to the required degree of accuracy and then by means of the tables referred to above the higher tetrachoric functions are obtained in rapid succession on the machine for a given value of the argument. In the testing of our tetrachoric series seven-place accuracy was aimed at so that it was necessary to calculate $\tau_1$ to eight places, which was done with the help of Vega's ten-figure logarithms.
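The procedure just described, starting from $\tau_1$ and applying the difference equation, can be sketched in a few lines. This is an illustration only: the function name is hypothetical, ordinary double-precision arithmetic replaces the tables and ten-figure logarithms, and $\tau_0$ is taken here as the upper-tail normal area, which is an assumption of the sketch.

```python
import math

def tetrachoric(x, n):
    """Return [tau_0, tau_1, ..., tau_n] at argument x by the difference equation.

    tau_0 is taken as the upper-tail normal area (an assumption of this sketch),
    tau_1 is the normal ordinate, and tau_s = beta_s*x*tau_{s-1} - gamma_s*tau_{s-2}.
    """
    tau = [0.5 * math.erfc(x / math.sqrt(2.0)),
           math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)]
    for s in range(2, n + 1):
        beta = 1.0 / math.sqrt(s)
        gamma = (s - 2) / math.sqrt(s * (s - 1))   # gamma_2 = 0, so tau_0 drops out
        tau.append(beta * x * tau[s - 1] - gamma * tau[s - 2])
    return tau[: n + 1]

# Argument -1, as used for Table II of the paper.
for s, value in enumerate(tetrachoric(-1.0, 5)):
    print(s, round(value, 8))
```

At the argument −1 the values of $\tau_1$, $\tau_2$, $\tau_4$ and $\tau_5$ produced this way agree with the tabled 0.24197074, −0.17109916, +0.09878417 and −0.04417762 to about seven places.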


* Forelæsninger over Almindelig Iagttagelseslære, Kjøbenhavn, 1889.
† Royal Soc. Proc. Vol. LVI. p. 271, and in many papers, Journal of R. Statistical Society.
‡ Vorlesungen über die Grundzüge der mathematischen Statistik (Hamburg, 1920), p. 67.
§ Biometrika, Vol. VI. p. 68, 1908.
|| Arne Fisher, Casualty Actuarial and Statistical Society of America, Proceedings, Vol. IV. Part I. No. 9.
¶ A normal curve, for example, is quite adequately represented by two or three periodic terms; see Phil. Trans. Vol. CLXXXVI. A, p. 355, 1895.
(2) It is well known that a wide range of frequency distributions can be adequately represented by one or other of the curves
$$\left.\begin{aligned} y &= y_0\,e^{-\gamma x}\left(1+\frac{x}{a}\right)^{\gamma a} &&\text{(a)}\\ y &= y_0\left(1+\frac{x}{a_1}\right)^{m_1-1}\left(1-\frac{x}{a_2}\right)^{m_2-1} &&\text{(b)} \end{aligned}\right\} \tag{vii}.$$

By a change of origin and the appropriate stretch or squeeze these may be reduced to
$$\left.\begin{aligned} y &= y_0\,x^{p-1}e^{-x} &&\text{(a)}\\ \text{and}\quad y &= y_0\,x^{m_1-1}(1-x)^{m_2-1} &&\text{(b)} \end{aligned}\right\} \tag{vii bis}.$$
Now, generally, it is not the ordinates of these curves which are required but the areas of certain portions, or in other words the probability integrals of these skew curves. The total range for (vii) bis (a) is 0 to ∞ and for (b) is 0 to 1; since
$$\int_0^{\infty} x^{p-1}e^{-x}\,dx = \Gamma(p)$$
and
$$\int_0^{1} x^{m_1-1}(1-x)^{m_2-1}\,dx = B(m_1, m_2),$$
we may take these probability integrals to be
$$I(p, x) = \frac{1}{\Gamma(p)}\int_0^{x} u^{p-1}e^{-u}\,du$$
and
$$B(x, m_1, m_2) = \frac{1}{B(m_1, m_2)}\int_0^{x} y^{m_1-1}(1-y)^{m_2-1}\,dy,$$
which are the ratios of the incomplete to the complete Γ- and B-functions.

The equations on p. 158 show us that if either of the frequency functions (vii) is expressible in a series of tetrachoric functions their probability integrals (assuming convergence) will also be. Now there is no doubt that a large mass of material does not differ practically from the forms in (vii), and accordingly if the above probability integrals cannot be adequately expressed in a series of tetrachoric functions, we may be certain that tetrachoric functions do not furnish a suitable method of representing skew frequency. Accordingly our problem reduces itself to the following one: Can $I(p, x)$ and $B(x, m_1, m_2)$, or the Incomplete Γ- and B-functions, be represented with adequate convergency by a series of tetrachoric functions? After examination of the numerical and graphical results obtained, we are obliged to conclude that the answer to this question is in the negative.


(3) Let us first consider the expansion in tetrachoric functions of the function
$$y = \frac{x^{p-1}e^{-x}}{\Gamma(p)} \tag{viii}.$$

In expanding this expression there are at least two methods which we ought to consider, and one may have advantages over the other as far as convergency is concerned. It may be expanded with regard (i) to the mean and the standard deviation, or (ii) to the mode in the manner of Laplace*.

(i) The mean of the function (viii) is easily found to be at $x = p$, the mode is at $x = p - 1$ and the standard deviation is $\sqrt p$.

Referring to the mean as origin the function becomes
$$y = \frac{(\xi+p)^{p-1}e^{-(\xi+p)}}{\Gamma(p)} \tag{ix}.$$
Let
$$y = \phi(-D)\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}},\quad\text{where } D = \frac{d}{dz} \text{ and } z = \frac{\xi}{\sqrt p} \tag{x}.$$
Except for a numerical factor the right-hand side is a series of tetrachoric functions.

Let $\phi(-D) = c_0 - c_1 D + c_2 D^2 - \dots + (-1)^s c_s D^s + \dots$

The function $\phi(-D)$ has to be determined, i.e. we require to find the successive c's:
$$\frac{(\xi+p)^{p-1}e^{-(\xi+p)}}{\Gamma(p)} = c_0\tau_1 + c_1\sqrt{2!}\,\tau_2 + c_2\sqrt{3!}\,\tau_3 + \dots + c_{s-1}\sqrt{s!}\,\tau_s + \dots \tag{xi}.$$

To determine the c's. With the origin at the mean the function $y$ must be taken as zero from $-\infty$ to $-p$, while from $-p$ to $+\infty$ it is given by (ix). The c's will be obtained most easily by multiplying both sides of (x) by $e^{\theta\xi}$ and equating the coefficients of powers of $\theta$ on both sides of the equation, i.e. we make all the moments of the two expressions for the curve the same, for the coefficient of $\theta^s$ on either side is the sth moment†. Thus
$$\int_{-\infty}^{\infty} y\,e^{\theta\xi}\,d\xi = \int_{-\infty}^{\infty} e^{\theta\xi}\,\phi(-D)\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,d\xi;$$
but $y = 0$ from $\xi = -\infty$ to $-p$.

Now $x = p + \xi$ and $z = \xi/\sqrt p$. Thus
$$\int_0^{\infty} e^{\theta(x-p)}\,\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \sqrt p\int_{-\infty}^{\infty} e^{\theta\sqrt p\,z}\,\phi(-D)\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,dz.$$

The left-hand side is equal to
$$e^{-p\theta}\int_0^{\infty}\frac{x^{p-1}e^{-x(1-\theta)}}{\Gamma(p)}\,dx = e^{-p\theta}(1-\theta)^{-p}\int_0^{\infty}\frac{u^{p-1}e^{-u}}{\Gamma(p)}\,du \qquad [\text{Let } x(1-\theta) = u]$$
$$= e^{-p\theta}(1-\theta)^{-p}.$$
Accordingly
$$e^{-p\theta}(1-\theta)^{-p} = \sqrt p\int_{-\infty}^{\infty} e^{\theta\sqrt p\,z}\,\phi(-D)\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,dz.$$

* Laplace's method is really an expansion in incomplete normal moment functions but as we have seen (p. 158) these may be replaced by tetrachoric functions.

† We owe this elegant method of determining the c's to Mr H. E. Soper. Originally the c's were determined by use of the fundamental property of the tetrachoric functions but that method, while leading to the same result, is more laborious.
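To find the value of the integral on the right-hand side, consider the term $c_s(-D)^s$ in the function $\phi(-D)$. Its contribution to the integral is
$$c_s\int_{-\infty}^{\infty} e^{\theta' z}\left(-\frac{d}{dz}\right)^{s}\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,dz,\quad\text{where } \theta' = \theta\sqrt p.$$

On integrating by parts the term between limits vanishes owing to the factor $e^{-\frac12 z^2}$. Hence the integral
$$= c_s\,\theta'\int_{-\infty}^{\infty} e^{\theta' z}\left(-\frac{d}{dz}\right)^{s-1}\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,dz,$$
and ultimately
$$= c_s\,\theta'^{\,s}\int_{-\infty}^{\infty} e^{\theta' z}\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,dz = c_s\,\theta'^{\,s}\,e^{\frac12\theta'^2}.$$
Therefore the whole integral on the right is $\phi(\theta')\,e^{\frac12\theta'^2}$,
i.e. $e^{-p\theta}(1-\theta)^{-p} = \sqrt p\,\phi(\sqrt p\,\theta)\,e^{\frac12 p\theta^2}$,
or $\sqrt p\,\phi(\sqrt p\,\theta) = e^{-p\theta-\frac12 p\theta^2}(1-\theta)^{-p}$,
and
$$\phi(\sqrt p\,\theta) = c_0 + c_1(\sqrt p\,\theta) + c_2(\sqrt p\,\theta)^2 + \dots + c_s(\sqrt p\,\theta)^s + \dots = c_0' + c_1'\theta + c_2'\theta^2 + \dots + c_s'\theta^s + \dots,$$
where $c_s' = c_s(\sqrt p)^s$ or $c_s = c_s'(\sqrt p)^{-s}$.

Now
$$e^{-p\theta-\frac12 p\theta^2}(1-\theta)^{-p} = e^{-p\theta-\frac12 p\theta^2-p\log(1-\theta)} = e^{-p\theta-\frac12 p\theta^2+p\theta+\frac12 p\theta^2+\frac13 p\theta^3+\frac14 p\theta^4+\dots}$$
$$= e^{\frac13 p\theta^3+\frac14 p\theta^4+\frac15 p\theta^5+\dots} = b_0 + b_1\theta + b_2\theta^2 + b_3\theta^3 + b_4\theta^4 + \dots,$$
where
$$b_0 = 1,\quad b_1 = b_2 = 0,\quad b_3 = \tfrac13 p,\quad b_4 = \tfrac14 p,\quad b_5 = \tfrac15 p,\quad b_6 = \tfrac16 p + \tfrac1{18}p^2 = \tfrac1{18}p\,(p+3),\ \text{etc.}$$
But $\sqrt p\,c_s' = b_s$, therefore
$$c_0 = \frac{1}{\sqrt p},\quad c_1 = c_2 = 0,\quad c_3 = \frac{1}{3p},\quad c_4 = \frac{1}{4p\sqrt p},\quad c_5 = \frac{1}{5p^2},\ \text{etc.}$$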
For numerical purposes these coefficients are much more usefully obtained in the following way:

Let $e^{-p\theta-\frac12 p\theta^2}(1-\theta)^{-p} = b_0 + b_1\theta + b_2\theta^2 + \dots$

Take the differential of the logarithms of both sides:
$$\left(-p - p\theta + \frac{p}{1-\theta}\right)(b_0 + b_1\theta + b_2\theta^2 + \dots + b_s\theta^s + \dots) = b_1 + 2b_2\theta + \dots + s\,b_s\theta^{s-1} + \dots,$$
i.e.
$$p\,\theta^2\,(b_0 + b_1\theta + \dots + b_s\theta^s + \dots) = (1-\theta)(b_1 + 2b_2\theta + \dots + s\,b_s\theta^{s-1} + \dots).$$
Equating coefficients of $\theta^s$ we have
$$p\,b_{s-2} = (s+1)\,b_{s+1} - s\,b_s,$$
i.e.
$$b_{s+1} = \frac{1}{s+1}\{s\,b_s + p\,b_{s-2}\} \tag{xii}.$$

By this difference formula successive b's can be found very quickly if $b_0$, $b_1$, $b_2$ are known and we have already found these.

Now
$$c_s = (\sqrt p)^{-s}c_s' = (\sqrt p)^{-s}\,b_s\,(\sqrt p)^{-1} = b_s(\sqrt p)^{-(s+1)},$$
or $b_s = (\sqrt p)^{s+1}c_s$.

Substituting in (xii)
$$(\sqrt p)^{s+2}c_{s+1} = \frac{1}{s+1}\{s(\sqrt p)^{s+1}c_s + p\,(\sqrt p)^{s-1}c_{s-2}\} = \frac{(\sqrt p)^{s+1}}{s+1}\{s\,c_s + c_{s-2}\},$$
or
$$c_{s+1} = \frac{1}{\sqrt p\,(s+1)}\{s\,c_s + c_{s-2}\} \tag{xiii}.$$

This formula gives us very readily the coefficients of $\phi(-D)$ and thus the expansion is obtained.
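The two difference formulae lend themselves to direct computation. The sketch below is illustrative only (function names are hypothetical): it generates the b's by (xii) in exact rational arithmetic and the c's by (xiii) in floating point, so that the relation $c_s = b_s(\sqrt p)^{-(s+1)}$ can be checked.

```python
from fractions import Fraction
import math

def b_coefficients(p, n):
    """b_0..b_n from b_{s+1} = (s*b_s + p*b_{s-2}) / (s+1), with b_0=1, b_1=b_2=0."""
    b = [Fraction(1), Fraction(0), Fraction(0)]
    for s in range(2, n):
        b.append((s * b[s] + p * b[s - 2]) / (s + 1))
    return b[: n + 1]

def c_coefficients(p, n):
    """c_0..c_n from c_{s+1} = (s*c_s + c_{s-2}) / (sqrt(p)*(s+1)), c_0 = 1/sqrt(p)."""
    root = math.sqrt(p)
    c = [1.0 / root, 0.0, 0.0]
    for s in range(2, n):
        c.append((s * c[s] + c[s - 2]) / (root * (s + 1)))
    return c[: n + 1]

p = 49  # the value of p used in the paper's Tables I and II
b = b_coefficients(p, 8)
c = c_coefficients(p, 8)
print(b[6], Fraction(p * (p + 3), 18))      # b_6 = p(p+3)/18
print(c[3], 1.0 / (3 * p))                  # c_3 = 1/(3p)
```

Keeping the b's as exact fractions avoids any rounding until the final division by powers of $\sqrt p$.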
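We had, Equation (xi),
$$\frac{(\xi+p)^{p-1}e^{-(\xi+p)}}{\Gamma(p)} = c_0\tau_1 + c_1\sqrt{2!}\,\tau_2 + c_2\sqrt{3!}\,\tau_3 + \dots + c_{s-1}\sqrt{s!}\,\tau_s + \dots,$$
and all the c's are known since $c_0 = \dfrac{1}{\sqrt p}$, $c_1 = c_2 = 0$.

To find the area under the curve (xi) up to abscissa $x$, remembering that the left-hand side is zero from $\xi = -\infty$ to $-p$,
$$\int_{-p}^{\xi}\frac{(\xi+p)^{p-1}e^{-(\xi+p)}}{\Gamma(p)}\,d\xi = \int_{-\infty}^{\xi}\phi(-D)\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,d\xi,$$
i.e.
$$\int_0^{x}\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \sqrt p\int_{-\infty}^{z}\phi(-D)\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,dz = \sqrt p\int_{-\infty}^{z}\left(c_0\tau_1 + c_1\sqrt{2!}\,\tau_2 + \dots + c_{s-1}\sqrt{s!}\,\tau_s + \dots\right)dz.$$
Now
$$\int_{-\infty}^{z}\tau_s\,dz = -\frac{1}{\sqrt s}\,\tau_{s-1},$$
therefore
$$\int_0^{x}\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \tfrac12(1+\alpha_z) - \sqrt p\left\{c_1\tau_1 + c_2\sqrt{2!}\,\tau_2 + \dots + c_{s-1}\sqrt{(s-1)!}\,\tau_{s-1} + \dots\right\},\quad\text{as } c_0 = \frac{1}{\sqrt p}.$$
Therefore finally
$$\int_0^{x}\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \tfrac12(1+\alpha_z) - a_3\tau_3 - a_4\tau_4 - \dots - a_s\tau_s - \dots \tag{xiv},$$
(since $c_1 = c_2 = 0$) where $a_s = \sqrt p\,c_s\sqrt{s!}$.

Now $c_{s+1} = \dfrac{1}{\sqrt p\,(s+1)}\{s\,c_s + c_{s-2}\}$ from equation (xiii),
i.e.
$$\frac{a_{s+1}}{\sqrt p\,\sqrt{(s+1)!}} = \frac{1}{\sqrt p\,(s+1)}\left\{\frac{s\,a_s}{\sqrt p\,\sqrt{s!}} + \frac{a_{s-2}}{\sqrt p\,\sqrt{(s-2)!}}\right\},$$
therefore
$$a_{s+1} = \frac{1}{\sqrt p\,(s+1)}\left\{s\sqrt{s+1}\;a_s + \sqrt{(s+1)\,s\,(s-1)}\;a_{s-2}\right\} = \frac{1}{\sqrt{p\,(s+1)}}\left\{s\,a_s + \sqrt{s(s-1)}\;a_{s-2}\right\} \tag{xv},$$
where $a_0 = 1$, $a_1 = a_2 = 0$.

The argument of $\tfrac12(1+\alpha_z)$ and of the tetrachoric functions is $\xi/\sqrt p$, which equals $\dfrac{x-p}{\sqrt p} = z$, say.

Since the terms $\tau_1$ and $\tau_2$ do not appear one might hope that only a few terms of the expansion (xiv) would be required to obtain a sufficiently accurate result.

$\tfrac12(1+\alpha_z)$ is the ordinary probability integral at $z$.

Note that if $x$ is less than $p$, i.e. $z$ is negative, $\tfrac12(1-\alpha_z)$ must be used instead of $\tfrac12(1+\alpha_z)$ and the tetrachoric functions of even order must be taken of opposite sign to those for positive $z$ such as are given in the tables. The odd order functions are the same for positive and negative $z$:
$$\tau_{2s}(z) = -\tau_{2s}(-z),\qquad \tau_{2s+1}(z) = \tau_{2s+1}(-z).$$

Obviously we could get the area of any portion of the curve between $x = x_1$ and $x = x_2$ by subtracting two expressions like (xiv) for $z_1$ and $z_2$.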

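The general expression for $\dfrac{x^{p-1}e^{-x}}{\Gamma(p)}$ is
$$\begin{aligned}\frac{x^{p-1}e^{-x}}{\Gamma(p)} ={}& \frac{1}{\sqrt p}\,\tau_1 + \frac1p\,\frac{\sqrt{4!}}{3}\,\tau_4 + \frac{1}{p\sqrt p}\,\frac{\sqrt{5!}}{4}\,\tau_5 + \frac{1}{p^2}\,\frac{\sqrt{6!}}{5}\,\tau_6 + \frac{1}{p^2\sqrt p}\,\frac{\sqrt{7!}}{6}\left(\frac p3+1\right)\tau_7\\ &+ \frac{1}{p^3}\,\frac{\sqrt{8!}}{7}\left(\frac{7p}{12}+1\right)\tau_8 + \frac{1}{p^3\sqrt p}\,\frac{\sqrt{9!}}{8}\left(\frac{47p+60}{60}\right)\tau_9 + \frac{1}{p^4}\,\frac{\sqrt{10!}}{9}\left(\frac{p^2}{18}+\frac{19}{20}p+1\right)\tau_{10}\\ &+ \frac{1}{p^4\sqrt p}\,\frac{\sqrt{11!}}{10}\left(\frac{5}{36}p^2+\frac{153}{140}p+1\right)\tau_{11} + \frac{1}{p^5}\,\frac{\sqrt{12!}}{11}\left(\frac{341}{1440}p^2+\frac{341}{280}p+1\right)\tau_{12}\\ &+ \frac{1}{p^5\sqrt p}\,\frac{\sqrt{13!}}{12}\left(\frac{p^3}{162}+\frac{493}{1440}p^2+\frac{3349}{2520}p+1\right)\tau_{13} + \dots\end{aligned}$$
and
$$\begin{aligned}\int_0^{x}\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx ={}& \tfrac12(1+\alpha_z) - \frac{1}{\sqrt p}\,\frac{\sqrt{4!}}{3\sqrt4}\,\tau_3 - \frac1p\,\frac{\sqrt{5!}}{4\sqrt5}\,\tau_4 - \frac{1}{p\sqrt p}\,\frac{\sqrt{6!}}{5\sqrt6}\,\tau_5 - \frac{1}{p^2}\,\frac{\sqrt{7!}}{6\sqrt7}\left(\frac p3+1\right)\tau_6\\ &- \frac{1}{p^2\sqrt p}\,\frac{\sqrt{8!}}{7\sqrt8}\left(\frac{7p}{12}+1\right)\tau_7 - \frac{1}{p^3}\,\frac{\sqrt{9!}}{8\sqrt9}\left(\frac{47p+60}{60}\right)\tau_8 - \frac{1}{p^3\sqrt p}\,\frac{\sqrt{10!}}{9\sqrt{10}}\left(\frac{p^2}{18}+\frac{19}{20}p+1\right)\tau_9\\ &- \frac{1}{p^4}\,\frac{\sqrt{11!}}{10\sqrt{11}}\left(\frac{5}{36}p^2+\frac{153}{140}p+1\right)\tau_{10} - \frac{1}{p^4\sqrt p}\,\frac{\sqrt{12!}}{11\sqrt{12}}\left(\frac{341}{1440}p^2+\frac{341}{280}p+1\right)\tau_{11}\\ &- \frac{1}{p^5}\,\frac{\sqrt{13!}}{12\sqrt{13}}\left(\frac{p^3}{162}+\frac{493}{1440}p^2+\frac{3349}{2520}p+1\right)\tau_{12} - \dots\end{aligned}$$
$$\begin{aligned}={}& \tfrac12(1+\alpha_z) - \frac{0.81649658}{\sqrt p}\,\tau_3 - \frac{1.22474487}{p}\,\tau_4 - \frac{2.19089023}{p\sqrt p}\,\tau_5 - \frac{1.49071198}{p^2}\,(p+3)\,\tau_6\\ &- \frac{0.84515425}{p^2\sqrt p}\,(7p+12)\,\tau_7 - \frac{0.41833001}{p^3}\,(47p+60)\,\tau_8 - \frac{0.37184890}{p^3\sqrt p}\,(10p^2+171p+180)\,\tau_9\\ &- \frac{0.15118579}{p^4}\,(175p^2+1377p+1260)\,\tau_{10} - \frac{0.05698029}{p^4\sqrt p}\,(2387p^2+12276p+10080)\,\tau_{11}\\ &- \frac{0.02010408}{p^5}\,(560p^3+31059p^2+120564p+90720)\,\tau_{12} - \dots\end{aligned}$$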


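(ii) Laplacian Form of Expansion.

This is an expansion with regard to the mode or maximum ordinate as origin.

The mode of $y = \dfrac{x^{p-1}e^{-x}}{\Gamma(p)}$ is at $x = (p-1)$, so that it will be easier to deal with $y$ in the form
$$y = \frac{x^{p'}e^{-x}}{\Gamma(p'+1)},$$
where $p' = (p-1)$.

Let $x = p' + \xi$, i.e. take the mode as origin. Then as before we require to find $\phi(-D)$ so that
$$\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)} = \phi(-D)\,\frac{e^{-\frac12\xi^2/p'}}{\sqrt{2\pi p'}} \tag{xvi},$$
where $D = \dfrac{d}{dz}$ and $z = \dfrac{\xi}{\sqrt{p'}}$.

The introduction of $\sqrt{p'}$ in the denominator simplifies the integration a little.

Proceeding as before:
$$\int_{-\infty}^{\infty} e^{\theta\xi}\,\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)}\,d\xi = \int_{-\infty}^{\infty} e^{\theta\xi}\,\phi(-D)\,\frac{e^{-\frac12\xi^2/p'}}{\sqrt{2\pi p'}}\,d\xi,$$
i.e.
$$\int_0^{\infty}\frac{e^{\theta(x-p')}\,x^{p'}e^{-x}}{\Gamma(p'+1)}\,dx = \int_{-\infty}^{\infty} e^{\theta\sqrt{p'}\,z}\,\phi(-D)\,\frac{e^{-\frac12 z^2}}{\sqrt{2\pi}}\,dz,$$
and
$$e^{-p'\theta}(1-\theta)^{-(p'+1)} = \phi(\theta\sqrt{p'})\int_{-\infty}^{\infty}\frac{e^{\theta\sqrt{p'}\,z-\frac12 z^2}}{\sqrt{2\pi}}\,dz = \phi(\theta\sqrt{p'})\,e^{\frac12 p'\theta^2}\int_{-\infty}^{\infty}\frac{e^{-\frac12(z-\theta\sqrt{p'})^2}}{\sqrt{2\pi}}\,dz = \phi(\theta\sqrt{p'})\,e^{\frac12 p'\theta^2},$$
therefore
$$\phi(\theta\sqrt{p'}) = e^{-p'\theta-\frac12 p'\theta^2}(1-\theta)^{-(p'+1)} \tag{xvii}.$$
Now if $\phi(-D) = c_0 - c_1 D + c_2 D^2 - \dots + (-1)^s c_s D^s + \dots$,
$$\phi(\theta\sqrt{p'}) = c_0 + c_1(\theta\sqrt{p'}) + c_2(\theta\sqrt{p'})^2 + \dots + c_s(\theta\sqrt{p'})^s + \dots = c_0' + c_1'\theta + c_2'\theta^2 + \dots + c_s'\theta^s + \dots,$$
where $c_s' = c_s(\sqrt{p'})^s$ or $c_s = c_s'(\sqrt{p'})^{-s}$,
$$e^{-p'\theta-\frac12 p'\theta^2}(1-\theta)^{-(p'+1)} = e^{-p'\theta-\frac12 p'\theta^2-(p'+1)\log(1-\theta)} = e^{-p'\theta-\frac12 p'\theta^2+(p'+1)\left(\theta+\frac{\theta^2}{2}+\frac{\theta^3}{3}+\dots\right)} = e^{\theta+\frac12\theta^2+(p'+1)\left(\frac{\theta^3}{3}+\frac{\theta^4}{4}+\dots\right)},$$
where $c_0' = 1$, $c_1' = 1$, $c_2' = 1$, $c_3' = \tfrac13 p' + 1$,
and generally by differentiating
$$c_s' = c_{s-1}' + \frac{p'}{s}\,c_{s-3}',$$
or
$$c_s(\sqrt{p'})^s = c_{s-1}(\sqrt{p'})^{s-1} + \frac{p'}{s}\,c_{s-3}(\sqrt{p'})^{s-3},$$
thus
$$c_s = \frac{1}{\sqrt{p'}}\left\{c_{s-1} + \frac1s\,c_{s-3}\right\} \tag{xviii},$$
where $c_0 = 1$, $c_1 = \dfrac{1}{\sqrt{p'}}$, $c_2 = \dfrac{1}{p'}$.

Therefore
$$\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)} = \{c_0 - c_1 D + c_2 D^2 - \dots + (-1)^s c_s D^s + \dots\}\,\frac{e^{-\xi^2/2p'}}{\sqrt{2\pi p'}}$$
$$= \frac{1}{\sqrt{p'}}\left\{c_0\tau_1 + \sqrt{2!}\,c_1\tau_2 + \sqrt{3!}\,c_2\tau_3 + \dots + \sqrt{(s+1)!}\,c_s\tau_{s+1} + \dots\right\}.$$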
To find the area up to abscissa $x$ we have
$$\int_{-p'}^{\xi}\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)}\,d\xi = \int_{-\infty}^{\xi}\frac{1}{\sqrt{p'}}\left\{c_0\tau_1 + \sqrt{2!}\,c_1\tau_2 + \dots + \sqrt{(s+1)!}\,c_s\tau_{s+1} + \dots\right\}d\xi$$
$$= \int_{-\infty}^{z}\left\{c_0\tau_1 + \sqrt{2!}\,c_1\tau_2 + \dots + \sqrt{(s+1)!}\,c_s\tau_{s+1} + \dots\right\}dz$$
$$= \tfrac12(1+\alpha_z) - c_1\tau_1 - \sqrt{2!}\,c_2\tau_2 - \sqrt{3!}\,c_3\tau_3 - \dots - \sqrt{s!}\,c_s\tau_s - \dots,\quad\text{as } c_0 = 1,$$
i.e.
$$\int_0^{x}\frac{x^{p'}e^{-x}}{\Gamma(p'+1)}\,dx = \tfrac12(1+\alpha_z) - a_1'\tau_1 - a_2'\tau_2 - a_3'\tau_3 - \dots - a_s'\tau_s - \dots,$$
where $a_s' = c_s\sqrt{s!}$.

Substituting in (xviii) to obtain the difference equation for the a's we have
$$\frac{a_s'}{\sqrt{s!}} = \frac{1}{\sqrt{p'}}\left\{\frac{a_{s-1}'}{\sqrt{(s-1)!}} + \frac1s\,\frac{a_{s-3}'}{\sqrt{(s-3)!}}\right\},$$
therefore
$$a_s' = \frac{1}{\sqrt{p's}}\left\{s\,a_{s-1}' + \sqrt{(s-1)(s-2)}\;a_{s-3}'\right\} \tag{xix},$$
and $a_0' = 1$, $a_1' = \dfrac{1}{\sqrt{p'}}$, $a_2' = \dfrac{\sqrt2}{p'}$.

By this formula the a's are readily obtained numerically. It is to be noted that in this case the terms in $\tau_1$ and $\tau_2$ do not vanish, as they did in the expansion from the mean. The argument of $\tfrac12(1+\alpha_z)$ and of the tetrachoric functions is $\dfrac{x-p'}{\sqrt{p'}}$, and the remarks with regard to sign made above must be again observed.
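The coefficients of the expansion from the mode follow from the difference equation $a_s' = \{s\,a_{s-1}' + \sqrt{(s-1)(s-2)}\,a_{s-3}'\}/\sqrt{p's}$ with the starting values $1$, $1/\sqrt{p'}$, $\sqrt2/p'$. A minimal sketch (the function name and the choice $p' = 48$, corresponding to $p = 49$, are illustrative assumptions) generates them and compares the first two non-trivial ones with the closed forms obtained by expanding the generating function by hand.

```python
import math

def mode_coefficients(p_mode, n):
    """a'_0..a'_n for the expansion about the mode, p_mode = p - 1."""
    a = [1.0, 1.0 / math.sqrt(p_mode), math.sqrt(2.0) / p_mode]
    for s in range(3, n + 1):
        a.append((s * a[s - 1] + math.sqrt((s - 1) * (s - 2)) * a[s - 3])
                 / math.sqrt(p_mode * s))
    return a[: n + 1]

p_mode = 48.0
a = mode_coefficients(p_mode, 6)

# Closed forms: a'_3 = sqrt(3!)(p'/3 + 1)/p'^(3/2), a'_4 = sqrt(4!)(7p'/12 + 1)/p'^2.
a3_closed = math.sqrt(6.0) * (p_mode / 3.0 + 1.0) / p_mode ** 1.5
a4_closed = math.sqrt(24.0) * (7.0 * p_mode / 12.0 + 1.0) / p_mode ** 2
print(a[3], a3_closed)
print(a[4], a4_closed)
```

Unlike the expansion from the mean, the list begins with non-zero $a_1'$ and $a_2'$, so the terms in $\tau_1$ and $\tau_2$ remain in the series.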
Coefficients in the expansion from the mode:
$$a_0' = 1,\qquad a_1' = \frac{1}{\sqrt{p'}},\qquad a_2' = \frac{\sqrt{2!}}{p'},$$
$$a_3' = \frac{\sqrt{3!}}{p'\sqrt{p'}}\left(\frac{p'}{3}+1\right),\qquad a_4' = \frac{\sqrt{4!}}{p'^2}\left(\frac{7p'}{12}+1\right),\qquad a_5' = \frac{\sqrt{5!}}{p'^2\sqrt{p'}}\left(\frac{47p'}{60}+1\right),$$
$$a_6' = \frac{\sqrt{6!}}{p'^3}\left(\frac{p'^2}{18}+\frac{19}{20}p'+1\right),\qquad a_7' = \frac{\sqrt{7!}}{p'^3\sqrt{p'}}\left(\frac{5}{36}p'^2+\frac{153}{140}p'+1\right),$$
$$a_8' = \frac{\sqrt{8!}}{p'^4}\left(\frac{341}{1440}p'^2+\frac{341}{280}p'+1\right),\qquad a_9' = \frac{\sqrt{9!}}{p'^4\sqrt{p'}}\left(\frac{p'^3}{162}+\frac{493}{1440}p'^2+\frac{3349}{2520}p'+1\right),\ \dots$$

We note that the coefficients of powers of $\theta$ in the functions
$$\phi(\theta) = e^{\frac{p}{3}\theta^3+\frac{p}{4}\theta^4+\dots}$$
and
$$\phi'(\theta) = e^{\theta+\frac{\theta^2}{2}+\frac{p+1}{3}\theta^3+\frac{p+1}{4}\theta^4+\dots}$$
(in the expansion from the mode we had $p'$ for $p$ in $\phi'(\theta)$) are closely related. Then if $c_n$ is the coefficient of $\theta^n$ in $\phi(\theta)$ and $c_n'$ is the coefficient of $\theta^n$ in $\phi'(\theta)$,
$$c_n' = \frac{n+3}{p}\,c_{n+3}.$$


(4) In the last expansion it might seem possible to get rid of the terms in $\tau_1$ and $\tau_2$ by breaking away from Laplace and expanding with regard to $e^{-\frac12\xi^2/q}$ instead of $e^{-\frac12\xi^2/p'}$; then choose $q$ to give us the desired result. In Laplace's form of the modal expansion the exponential term is $e^{\frac12\xi^2\left(\frac{d^2u}{dx^2}\right)_{\text{Mode}}}$, where $u = \log y$ and $\left(\dfrac{d^2u}{dx^2}\right)_{\text{Mode}}$ means the value of $\dfrac{d^2u}{dx^2}$ at the mode.
$$y = \frac{x^{p'}e^{-x}}{\Gamma(p'+1)},$$
$$\log_e y = p'\log_e x - x - \log_e\Gamma(p'+1),$$
$$\frac{du}{dx} = \frac{p'}{x} - 1,$$
$$\frac{d^2u}{dx^2} = -\frac{p'}{x^2},$$
therefore
$$\left(\frac{d^2u}{dx^2}\right)_{\text{Mode}} = -\frac{p'}{p'^2} = -\frac{1}{p'}.$$
If $y = \phi(-D)\,\dfrac{e^{-\frac12\xi^2/q}}{\sqrt{2\pi q}}$, where $D = \dfrac{d}{dz}$ and $z = \dfrac{\xi}{\sqrt q}$, we have to find $q$, so that either the $\tau_1$ or $\tau_2$ term or both will vanish.

By proceeding as before equation (xvii) becomes
$$\phi(\theta\sqrt q) = e^{-p'\theta-\frac12 q\theta^2}(1-\theta)^{-(p'+1)} = e^{-p'\theta-\frac12 q\theta^2-(p'+1)\log(1-\theta)}.$$

The term in $\tau_2$ will vanish if $q = p' + 1$ which is the square of the standard deviation from the mean, but $\tau_1$ will still be left. However, it does not seem likely that any advantage will be gained by departing from Laplace's form of the exponential term.

Having found the two expansions from the mean and the mode respectively we shall now proceed to examine the behaviour of the series by numerical calculation, but before doing so we shall endeavour to find a similar series for the Incomplete B-function.


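(5) To expand $\displaystyle\int_0^{x}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx$ in terms of tetrachoric functions about the mean.

The mean is at $x = p/(p+q)$.

The standard deviation is $\sigma = \dfrac{\sqrt{pq}}{(p+q)\sqrt{p+q+1}}$.

Take origin at the mean; then $x = p/(p+q) + \xi$. Let
$$\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)} = \phi(-D)\,\frac{e^{-\frac12\xi^2/\sigma^2}}{\sigma\sqrt{2\pi}} \tag{xx},$$
where $D = \dfrac{d}{dy}$, $y = \dfrac{\xi}{\sigma}$.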
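As in the case of the Incomplete Γ-function multiply each side by $e^{\theta\xi}$ and integrate. The limits of the integral on the left-hand side will be $x = 0$ and $x = 1$, as we take the value of the integral outside these limits to be zero.

The $\xi$ limits will therefore be $-p/(p+q)$ for $x = 0$ and $q/(p+q)$ for $x = 1$. Then
$$\int_{-p/(p+q)}^{q/(p+q)} e^{\theta\xi}\,\frac{\{\xi+p/(p+q)\}^{p-1}\{1-(\xi+p/(p+q))\}^{q-1}}{B(p,q)}\,d\xi = \int_{-\infty}^{\infty} e^{\theta\xi}\,\phi(-D)\,\frac{e^{-\frac12\xi^2/\sigma^2}}{\sigma\sqrt{2\pi}}\,d\xi,$$
i.e.
$$\int_0^1 e^{\theta\{x-p/(p+q)\}}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \int_{-\infty}^{\infty} e^{\theta\sigma y}\,\phi(-D)\,\frac{e^{-\frac12 y^2}}{\sqrt{2\pi}}\,dy,$$
and
$$e^{-\theta p/(p+q)}\int_0^1 e^{\theta x}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \phi(\theta\sigma)\int_{-\infty}^{\infty}\frac{e^{-\frac12(y-\theta\sigma)^2+\frac12\theta^2\sigma^2}}{\sqrt{2\pi}}\,dy = \phi(\theta\sigma)\,e^{\frac12\sigma^2\theta^2} \tag{xxi}.$$
Now
$$\int_0^1 e^{\theta x}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \int_0^1\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\left\{1+\theta x+\frac{\theta^2x^2}{2!}+\frac{\theta^3x^3}{3!}+\dots+\frac{\theta^sx^s}{s!}+\dots\right\}dx$$
$$= \frac{B(p,q)}{B(p,q)} + \theta\,\frac{B(p+1,q)}{B(p,q)} + \frac{\theta^2}{2!}\,\frac{B(p+2,q)}{B(p,q)} + \dots + \frac{\theta^s}{s!}\,\frac{B(p+s,q)}{B(p,q)} + \dots$$
But
$$\frac{B(p+s,q)}{B(p,q)} = \frac{p(p+1)\dots(p+s-1)}{(p+q)(p+q+1)\dots(p+q+s-1)},$$
therefore
$$\int_0^1 e^{\theta x}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = 1 + \theta\,\frac{p}{p+q} + \frac{\theta^2}{2!}\,\frac{p(p+1)}{(p+q)(p+q+1)} + \dots + \frac{\theta^s}{s!}\,\frac{p(p+1)\dots(p+s-1)}{(p+q)(p+q+1)\dots(p+q+s-1)} + \dots$$

From equation (xxi)
$$\phi(\theta\sigma) = e^{-\theta\frac{p}{p+q}-\frac12\sigma^2\theta^2}\left\{1 + \theta\,\frac{p}{p+q} + \frac{\theta^2}{2!}\,\frac{p(p+1)}{(p+q)(p+q+1)} + \dots + \frac{\theta^s}{s!}\,\frac{p(p+1)\dots(p+s-1)}{(p+q)(p+q+1)\dots(p+q+s-1)} + \dots\right\} \tag{xxii}.$$
Let $\phi(-D) = a_0 - a_1 D + a_2 D^2 - \dots + (-1)^s a_s D^s + \dots$,
$$\phi(\theta\sigma) = a_0 + a_1(\theta\sigma) + a_2(\theta\sigma)^2 + a_3(\theta\sigma)^3 + \dots + a_s(\theta\sigma)^s + \dots = c_0 + c_1\theta + c_2\theta^2 + \dots + c_s\theta^s + \dots,$$
where $c_s = a_s\sigma^s$.

By equating coefficients of powers of $\theta$ in equation (xxii) the coefficients in $\phi(\theta\sigma)$ can be obtained in terms of $p$ and $q$, for
$$\sigma^2 = \frac{pq}{(p+q)^2(p+q+1)}.$$
Obviously
$$c_0 = 1,$$
$$c_1 = -p/(p+q) + p/(p+q) = 0,$$
$$c_2 = \frac{1}{2!}\,\frac{p(p+1)}{(p+q)(p+q+1)} - \frac{p^2}{(p+q)^2} + \frac12\,\frac{p^2}{(p+q)^2} - \frac12\,\frac{pq}{(p+q)^2(p+q+1)}$$
$$= \frac12\,\frac{p}{p+q}\left\{\frac{p+1}{p+q+1} - \frac{p}{p+q} - \frac{q}{(p+q)(p+q+1)}\right\} = 0.$$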


Similarly the other c's can be determined but the work becomes more and more laborious as we go on.

Unfortunately, as far as the numerical work is concerned, we have failed after many attempts to find a relation connecting successive c's, similar to that found in the case of the Incomplete Γ-function. At first it was thought that the following treatment would facilitate the numerical calculation of these coefficients.

Let
$$e^{-\frac{p}{p+q}\theta-\frac12\sigma^2\theta^2} = b_0 + b_1\theta + b_2\theta^2 + \dots + b_s\theta^s + \dots,$$
then
$$-\frac{p}{p+q}\,\theta - \tfrac12\sigma^2\theta^2 = \log_e\{b_0 + b_1\theta + b_2\theta^2 + \dots + b_s\theta^s + \dots\}.$$
Differentiate this and then equate coefficients of powers of $\theta$:
$$(b_0 + b_1\theta + b_2\theta^2 + \dots + b_s\theta^s + \dots)\left(-\frac{p}{p+q} - \sigma^2\theta\right) = b_1 + 2b_2\theta + 3b_3\theta^2 + \dots + s\,b_s\theta^{s-1} + \dots$$
Equate coefficients of $\theta^{s-1}$:
$$s\,b_s = -\frac{p}{p+q}\,b_{s-1} - \sigma^2\,b_{s-2};$$
therefore
$$b_s = -\frac1s\left[\frac{p}{p+q}\,b_{s-1} + \sigma^2\,b_{s-2}\right] \tag{xxiii}.$$

This formula enables us to calculate the b's very rapidly on the machine when $p/(p+q)$ and $\sigma^2$ have been determined.


From equation (xxii)
$$c_0 + c_1\theta + c_2\theta^2 + \dots + c_s\theta^s + \dots = (b_0 + b_1\theta + b_2\theta^2 + \dots)\left\{1 + \theta\,\frac{p}{p+q} + \frac{\theta^2}{2!}\,\frac{p(p+1)}{(p+q)(p+q+1)} + \dots\right\}.$$
Equate coefficients of $\theta^s$:
$$c_s = b_0\,\frac{1}{s!}\,\frac{p(p+1)\dots(p+s-1)}{(p+q)(p+q+1)\dots(p+q+s-1)} + b_1\,\frac{1}{(s-1)!}\,\frac{p(p+1)\dots(p+s-2)}{(p+q)(p+q+1)\dots(p+q+s-2)} + \dots + b_{s-1}\,\frac{p}{p+q} + b_s,$$
i.e.
$$c_s = \sum_{r=0}^{s} b_r\,\frac{1}{(s-r)!}\,\frac{p(p+1)\dots(p+s-r-1)}{(p+q)(p+q+1)\dots(p+q+s-r-1)}.$$
The b's having been calculated previously by (xxiii), this last formula gives a fairly rapid way of calculating the c's, at least the earlier c's. Then
$$a_s = \frac{c_s}{\sigma^s} = \frac{1}{\sigma^s}\sum_{r=0}^{s} b_r\,\frac{1}{(s-r)!}\,\frac{p(p+1)\dots(p+s-r-1)}{(p+q)(p+q+1)\dots(p+q+s-r-1)} \tag{xxiv}.$$
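The two steps just described, the b's by (xxiii) and then the c's as a sum of products with the moment ratios $B(p+k,q)/B(p,q)$, can be sketched in exact rational arithmetic. The function names below are hypothetical, and the small case $p = 1$, $q = 2$ is chosen only because its moments are easy to check by hand; it is not one of the paper's examples.

```python
from fractions import Fraction
import math

def beta_raw_moment(p, q, k):
    """B(p+k, q)/B(p, q) = p(p+1)...(p+k-1) / ((p+q)(p+q+1)...(p+q+k-1))."""
    p, q = Fraction(p), Fraction(q)
    m = Fraction(1)
    for j in range(k):
        m *= (p + j) / (p + q + j)
    return m

def beta_b_and_c(p, q, n):
    """b_0..b_n by b_s = -(mean*b_{s-1} + var*b_{s-2})/s, then c_0..c_n by the sum."""
    p, q = Fraction(p), Fraction(q)
    mean = p / (p + q)
    var = p * q / ((p + q) ** 2 * (p + q + 1))
    b = [Fraction(1), -mean]
    for s in range(2, n + 1):
        b.append(-(mean * b[s - 1] + var * b[s - 2]) / s)
    c = [sum(b[r] * beta_raw_moment(p, q, s - r) / math.factorial(s - r)
             for r in range(s + 1))
         for s in range(n + 1)]
    return b[: n + 1], c

b, c = beta_b_and_c(1, 2, 4)
print(c)   # c_1 and c_2 vanish; c_3 is one sixth of the third moment about the mean
```

In exact arithmetic the cancellation is harmless. With a fixed number of decimal places the sum consists of nearly cancelling terms and then has to be divided by $\sigma^s$, which makes the loss of accuracy reported by the authors easy to understand.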


What we require generally is the area represented by $\displaystyle\int_0^{x}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx$:
$$\int_0^{x}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \int_{-\infty}^{\xi}\phi(-D)\,\frac{e^{-\frac12\xi^2/\sigma^2}}{\sigma\sqrt{2\pi}}\,d\xi = \int_{-\infty}^{y}\phi(-D)\,\frac{e^{-\frac12 y^2}}{\sqrt{2\pi}}\,dy,$$
i.e.
$$\int_0^{x}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \int_{-\infty}^{y}\{a_0 - a_1 D + a_2 D^2 - \dots + (-1)^s a_s D^s + \dots\}\,\frac{e^{-\frac12 y^2}}{\sqrt{2\pi}}\,dy$$
$$= \tfrac12(1+\alpha) - a_1\sqrt{1!}\,\tau_1 - a_2\sqrt{2!}\,\tau_2 - a_3\sqrt{3!}\,\tau_3 - \dots - a_s\sqrt{s!}\,\tau_s - \dots\qquad (a_0 = 1)$$
$$= \tfrac12(1+\alpha) - a_1'\tau_1 - a_2'\tau_2 - a_3'\tau_3 - \dots - a_s'\tau_s - \dots,$$
where $a_s' = a_s\sqrt{s!}$.

Then
$$a_s' = \frac{\sqrt{s!}}{\sigma^s}\,c_s = \frac{\sqrt{s!}}{\sigma^s}\sum_{r=0}^{s} b_r\,\frac{1}{(s-r)!}\,\frac{p(p+1)\dots(p+s-r-1)}{(p+q)(p+q+1)\dots(p+q+s-r-1)} \tag{xxv}.$$

Now $c_1$ and $c_2$ are equal to zero, so that $a_1'$, $a_2'$ are zero. Thus there are no terms in $\tau_1$ and $\tau_2$. The argument of the tetrachoric functions and of $\tfrac12(1+\alpha)$ is $y$, which is equal to $\dfrac{\xi}{\sigma} = \dfrac{x-p/(p+q)}{\sigma}$. On applying the above formula for $a_s'$, we were greatly disappointed to find that with the b's to 8 decimal places the expression under the summation sign in the examples used commenced with 4 or 5 zeros after the decimal point. As $\sqrt{s!}$ and $\left(\dfrac1\sigma\right)^s$ both increase with $s$ ($\dfrac1\sigma$ being in our case $> 1$) accuracy to the seventh place in our a's could not be obtained. Accordingly the formula actually used was of a different type.
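Let
$$\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)} = \sum_{1}^{\infty} C_s\tau_s,$$
where the argument of the tetrachoric function is again $\dfrac{\xi}{\sigma} = \dfrac{x-p/(p+q)}{\sigma}$.

Multiply both sides by $\tau_s$, weighting by the factor $e^{\frac12\xi^2/\sigma^2}$, and integrate from $-\infty$ to $+\infty$, the left-hand side being taken as zero outside $x = 0$ and $x = 1$.

Then
$$\int_{-p/(p+q)}^{q/(p+q)}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,\tau_s\,e^{\frac12\xi^2/\sigma^2}\,d\xi = \int_{-\infty}^{\infty}\tau_s\sum_{1}^{\infty} C_{s'}\tau_{s'}\,e^{\frac12\xi^2/\sigma^2}\,d\xi.$$
Since $\displaystyle\int_{-\infty}^{\infty}\tau_s\tau_{s'}\,e^{\frac12\xi^2/\sigma^2}\,d\xi = 0$ only the term in $\tau_s$ will be left on the right-hand side, i.e.
$$\int_{-\infty}^{\infty}\tau_s\sum_{1}^{\infty} C_{s'}\tau_{s'}\,e^{\frac12\xi^2/\sigma^2}\,d\xi = C_s\int_{-\infty}^{\infty}\tau_s^2\,e^{\frac12\xi^2/\sigma^2}\,d\xi.$$
Putting $\xi/\sigma = y$, $d\xi = \sigma\,dy$, we have
$$\int_{-p/(p+q)}^{q/(p+q)}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,\tau_s\,e^{\frac12\xi^2/\sigma^2}\,d\xi = C_s\int_{-\infty}^{\infty}\tau_s^2\,e^{\frac12 y^2}\,\sigma\,dy = \frac{C_s\,\sigma}{s\sqrt{2\pi}},$$
i.e.
$$C_s = \frac{s\sqrt{2\pi}}{\sigma}\int_0^1\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,\tau_s\,e^{\frac12\left(\frac{x-p/(p+q)}{\sigma}\right)^2}dx$$
$$= \frac{s}{\sigma\sqrt{s!}}\int_0^1\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\left[\left(\frac{x-p/(p+q)}{\sigma}\right)^{s-1} - \frac{(s-1)(s-2)}{2\cdot1!}\left(\frac{x-p/(p+q)}{\sigma}\right)^{s-3} + \frac{(s-1)(s-2)(s-3)(s-4)}{2^2\cdot2!}\left(\frac{x-p/(p+q)}{\sigma}\right)^{s-5} - \dots\right]dx \tag{xxvi}.$$

The integral for any particular value of $s$ reduces to a series of B-functions and so $C_s$ is found.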


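The area up to abscissa $x$ is generally required:
$$\int_0^{x}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \sigma\int_{-\infty}^{y}\sum_{1}^{\infty}(C_s\tau_s)\,dy.$$
Now
$$\int_{-\infty}^{y}\tau_s\,dy = -\frac{1}{\sqrt s}\,\tau_{s-1},$$
therefore
$$\int_0^{x}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \sigma\int_{-\infty}^{y} C_1\tau_1\,dy - \sigma\sum_{2}^{\infty}\frac{C_s}{\sqrt s}\,\tau_{s-1}$$
$$= \sigma C_1\,\tfrac12(1+\alpha) - \sigma\left(\frac{C_2}{\sqrt2}\,\tau_1 + \frac{C_3}{\sqrt3}\,\tau_2 + \dots\right)$$
$$= \sigma C_1\,\tfrac12(1+\alpha) - a_1\tau_1 - a_2\tau_2 - a_3\tau_3 - \dots,$$
where $a_s = \dfrac{\sigma\,C_{s+1}}{\sqrt{s+1}}$.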
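If we put $s = 1$, $s = 2$, $s = 3$ in the above formula (xxvi) for $C_s$,
$$C_1 = \frac1\sigma,$$
$$C_2 = \frac{2}{\sigma\sqrt{2!}}\int_0^1\left(\frac{x-p/(p+q)}{\sigma}\right)\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \frac{2}{\sigma^2\sqrt{2!}}\left\{\frac{B(p+1,q)}{B(p,q)} - \frac{p}{p+q}\right\} = 0,$$
$$C_3 = \frac{3}{\sigma\sqrt{3!}}\int_0^1\left[\left(\frac{x-p/(p+q)}{\sigma}\right)^2 - 1\right]\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx$$
$$= \frac{3}{\sigma^3\sqrt{3!}}\int_0^1\left[x^2 - 2x\,\frac{p}{p+q} + \frac{p^2}{(p+q)^2} - \sigma^2\right]\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx$$
$$= \frac{3}{\sigma^3\sqrt{3!}}\,\frac{1}{B(p,q)}\left[B(p+2,q) - 2\,\frac{p}{p+q}\,B(p+1,q) + \left\{\frac{p^2}{(p+q)^2} - \sigma^2\right\}B(p,q)\right]$$
$$= \frac{3}{\sigma^3\sqrt{3!}}\left[\frac{p(p+1)}{(p+q)(p+q+1)} - \frac{p^2}{(p+q)^2} - \frac{pq}{(p+q)^2(p+q+1)}\right] = 0,$$
as obtained before by the other method.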


The terms in $\tau_1$ and $\tau_2$ do not exist, so that the expansion becomes:
$$\int_0^{x}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \tfrac12(1+\alpha) - a_3\tau_3 - a_4\tau_4 - \dots - a_s\tau_s - \dots,$$
where
$$a_s = \frac{\sigma\,C_{s+1}}{\sqrt{s+1}} = \frac{s+1}{\sqrt{s+1}\,\sqrt{(s+1)!}}\int_0^1\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\left[\left(\frac{x-p/(p+q)}{\sigma}\right)^{s} - \frac{s(s-1)}{2\cdot1!}\left(\frac{x-p/(p+q)}{\sigma}\right)^{s-2} + \frac{s(s-1)(s-2)(s-3)}{2^2\cdot2!}\left(\frac{x-p/(p+q)}{\sigma}\right)^{s-4} - \dots\right]dx \tag{xxvii}.$$

The argument for the tetrachoric functions and for $\tfrac12(1+\alpha)$ is $\dfrac{x-p/(p+q)}{\sigma}$. If this is negative then we must take $\tfrac12(1-\alpha)$.

From the above expression (xxvii) the coefficients of the expansion can be determined both algebraically and numerically, but for the higher coefficients the algebraic work becomes exceedingly heavy. It is to be remembered that
$$\sigma^2 = \frac{pq}{(p+q)^2(p+q+1)}.$$
Suppose $(p+q) = m$; then $\sigma^2 = \dfrac{pq}{m^2(m+1)}$.


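The coefficients $a_1$, $a_2$, ... etc. are given below:
$$a_1 = 0,\qquad a_2 = 0,\qquad a_3 = \frac{2}{\sqrt{3!}}\sqrt{\frac{m+1}{pq}}\;\frac{(m-2p)}{(m+2)},$$
$$a_4 = \frac{1}{\sqrt{4!}}\,\frac{3!}{(m+2)(m+3)}\left\{\frac{m^2(m+1)}{pq} - (5m+6)\right\},$$
$$a_5 = \frac{1}{\sqrt{5!}}\,4!\,\sqrt{\frac{m+1}{pq}}\;\frac{(m-2p)}{(m+2)(m+3)(m+4)}\left\{\frac{m^2(m+1)}{pq} - (7m+12)\right\},$$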
it! BI 1 m4‘ (m+ 1) 
“6 761. (m+2)(m+3)(m+4)(m+5) | pe 
ap BERS) ys — 32m — 60) — 2 (2m? — 41m? — 154m — 120)| ; 
3pq 
oe ley m+1 (m — 2p) ; 
aN Ta py (m+2)(m+3)(m+4)(m + 5) (m+ 6) 
mi(m+1P m(m+1),, eas 
x a 125 (7m + 15) (m — 20) 
. — 7, {Tm — 59m? — 342m — 360)! ; 
1 1 
As 


i. V8! ye (m+ 2)(m + 3) (m+ 4) (m + 5) (m+ 6)(m+7) 
- pes 1) | m(m+1) 
pq? 60p?q" 
m? (m+ 1) 
380pq 
+ gt {1271 mt — 1697 m? — 44512m? — 104364m — 65520) | : 


{47m? — 853m — 2100} 


{— 251m? + 1503m? + 9974m + 10920} 


An additional coefficient $a_9$ was calculated for one of our examples, but it was not considered worth while working it out algebraically.

The coefficients in the tetrachoric expansion obtained by this latter method, that is, by using the property of tetrachoric functions as semi-orthogonal functions, are identical with those obtained from the first method, which consisted in equating moments of the functions on both sides of the equation. Thus we are led to the same expansion in both cases.


(6) The numerical results are certainly interesting but from the utility point of view they are not very satisfactory. Tables I—VIII contain these results in a convenient form; the values of the coefficients $a_s$ and $a_s'$, the tetrachoric functions, the successive terms ($-a_s\tau_s$ and $-a_s'\tau_s$) and the values of the series up to the term containing $\tau_s$ are given. It is to be noted that the coefficients do not appear in Tables II and IV but as these are the same as in Tables I and III respectively it was not necessary to repeat them. In all these tables, in the row s = 0 we have placed $\tfrac12(1-\alpha)$ in the column containing the tetrachoric functions and it is only necessary to draw attention to the fact that in the next column the negative sign in $-a_s\tau_s$ does not apply to the first term $\tfrac12(1-\alpha)$. The tables will then be easily understood, but, in order that a better appreciation of the results may be obtained, the value of the series up to a certain term has been plotted against the number of that term. A line, drawn across the paper and corresponding to the true value of the integral, shows how much the value of the series is in excess or defect of the true value of the integral. The various points have been joined by continuous wavy lines but, of course, these lines have no real physical meaning. However, by joining the points, the graph will, we think, convey a better idea of the variation of the values of the series than a set of isolated points would. Figures 1—7 correspond to the data given in Tables I—VII.

TABLE I.

$\displaystyle\int_0^{29.4}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$, $z = -2.8$*.

  s |    a_s     | Tetrachoric Functions τ_s | Terms in Series −a_s τ_s | Value of Series up to term τ_s
  0 | 1.00000000 |       +0.0025551          |       +0.0025551         |   0.0025551
  1 | 0.00000000 |       +0.00791545         |           —              |       —
  2 | 0.00000000 |       −0.01567180         |           —              |       —
  3 | 0.11664237 |       +0.02210325         |       −0.0025782         |  −0.0000231
  4 | 0.02499479 |       −0.02189644         |       +0.0005473         |   0.0005242
  5 | 0.00638743 |       +0.01259137         |       −0.0000804         |   0.0004438
  6 | 0.03228531 |       +0.00159776         |       −0.0000516         |   0.0003922
  7 | 0.01785148 |       −0.01140536         |       +0.0002036         |   0.0005958
  8 | 0.00840223 |       +0.01000967         |       −0.0000841         |   0.0005117
  9 | 0.01470566 |       +0.00006659         |       −0.0000010         |   0.0005107
 10 | 0.01282194 |       −0.00849985         |       +0.0001090         |   0.0006197
 11 | 0.00895618 |       +0.00711870         |       −0.0000638         |   0.0005560
 12 | 0.01042333 |       +0.00164419         |       −0.0000171         |   0.0005388
 13 | 0.01079260 |       −0.00754632         |       +0.0000814         |   0.0006203
 14 | 0.00962776 |       +0.00418464         |       −0.0000403         |   0.0005800
 15 | 0.01015854 |       +0.00374438         |       −0.0000380         |   0.0005419
 16 | 0.01102777 |       −0.00640271         |       +0.0000706         |   0.0006125
 17 | 0.01128126 |       +0.00094254         |       −0.0000106         |   0.0006019
 18 | 0.01209893 |       +0.00523425         |       −0.0000633         |   0.0005386
 19 | 0.01345974 |       −0.00422873         |       +0.0000569         |   0.0005955
 20 | 0.01483350 |       −0.00218561         |       +0.0000324         |   0.0006279
 21 | 0.01660082 |       +0.00525599         |       −0.0000873         |   0.0005407
 22 | 0.01901932 |       −0.00110396         |       +0.0000210         |   0.0005617
 23 | 0.02196131 |       −0.00426227         |       +0.0000936         |   0.0006553
 24 | 0.02561864 |       +0.00346981         |       −0.0000889         |   0.0005664
 25 | 0.03033429 |       +0.00205905         |       −0.0000625         |   0.0005039
 26 | 0.03631783 |       −0.00439701         |       +0.0001597         |   0.0006636
 27 | 0.04391748 |       +0.00042653         |       −0.0000187         |   0.0006449
 28 | 0.05371111 |       +0.00393217         |       −0.0002112         |   0.0004337
 29 | 0.06638572 |       −0.00244866         |       +0.0001626         |   0.0005963
 30 | 0.08285325 |       −0.00248099         |       +0.0002056         |   0.0008018

True value 0.0005850.

* $z = \dfrac{x-p}{\sqrt p} = \dfrac{29.4-49}{7} = -2.8$.

Now in the case of the Incomplete Γ-function we obtained two expansions, with respect to the mean and the mode respectively, and the graphs tell us which of these two gives us the better approximation. Figs. 1 and 3 (Tables I and III) show the variations in the values of the series for $\displaystyle\int_0^{29.4}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$ from the mean and the mode respectively, while Figs. 2 and 4 give us similar information for $\displaystyle\int_0^{42}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$.

[Fig. 1. Value of the series plotted against the NUMBER OF TERMS, from ½(1−α) and τ_3 up to τ_30.]

[Fig. 2. Value of the series plotted against the NUMBER OF TERMS, from ½(1−α) and τ_3 up to τ_30.]


It will be seen that in Fig. 1 the points are much closer to the ‘true value’ 
line than in Fig. 3 (and similarly in Fig. 2 they are closer than in Fig. 4) so that 
the expansion from the mean seems to give a better approximation than that 
from the mode and it has the additional advantage that the terms in 7, and 7, are 
missing. Besides, it seems more natural to expand these normal curve functions in 
terms of the mean and standard deviation. For comparison purposes the graphs are 
all on the same scale. The graphs for the mode and the mean behave in a very 
TABLE II.

$\int_0^{42} \frac{x^{48} e^{-x}}{\Gamma(49)}\,dx$, $z = -1$*.

 s    Tetrachoric       Terms in Series    Value of Series
      Functions τ_s     −a_s τ_s           up to term τ_s

 0    +.1586553         +.1586553          .1586553
 1    +.24197074         .0000000             –
 2    −.17109916         .0000000             –
 3     .00000000         .0000000             –
 4    +.09878417        −.0024691          .1561862
 5    −.04417762        +.0002822          .1564684
 6    −.05410632        +.0017468          .1582152
 7    +.05453404        −.0009735          .1572417
 8    +.02410087        −.0002025          .1570392
 9    −.05302190        +.0007797          .1578189
10    −.00355664        +.0000456          .1578645
11    +.04657133        −.0004172          .1574474
12    −.01034833        +.0001079          .1575553
13    −.03814548        +.0004117          .1579669
14    +.01939964        −.0001868          .1577802
15    +.02921077        −.0002967          .1574834
16    −.02483411        +.0002739          .1577573
17    −.02054429        +.0002318          .1579891
18    +.02755708        −.0003334          .1576556
19    +.01256341        −.0001691          .1574865
20    −.02825493        +.0004191          .1579057
21    −.00548187        +.0000910          .1579967
22    +.02745951        −.0005223          .1574744
23    −.00060803        +.0000134          .1574878
24    −.02558848        +.0006555          .1581433
25    +.00568862        −.0001726          .1579707
26    +.02297227        −.0008343          .1571364
27    −.00978859        +.0004299          .1575663
28    −.01987296        +.0010674          .1586337
29    +.01296514        −.0008607          .1577730
30    +.01649808        −.0013669          .1564061

True value .1577387.

* $z = \dfrac{x - p}{\sqrt{p}} = \dfrac{42 - 49}{7} = -1$.

Fig. 3. [Graph: value of the series plotted against the number of terms, $\tau_0$ to $\tau_{30}$.]

similar manner; for, if we regard the graphs as a wave, it will be noticed that at
first the amplitude of the wave is big, decreases gradually up to a term in the 
neighbourhood of +) and thereafter increases more and more rapidly. This can be 
explained fairly easily; as s increases the tetrachoric functions $\tau_s$ do not increase or
decrease steadily but vary in sign and remain of the same order of magnitude. The
coefficients $a_s$ vary in much the same way (except that they are all positive) up to
a certain point and then begin to increase very fast. In equation (xv) we had

$$a_{s+1} = \frac{1}{\sqrt{p}}\sqrt{\frac{s}{s+1}}\left\{\sqrt{s}\,a_s + \sqrt{s-1}\,a_{s-2}\right\},$$

i.e. $a_{s+1}$ is of order $\sqrt{s/p}\,\{a_s + a_{s-2}\}$, so that as s increases there comes a time when
$\sqrt{s}$ overcomes the reducing effect of $1/\sqrt{p}$ and then the coefficients will continually
increase. For higher values of p this turning point will not be arrived at so soon
and the points will hang closer to the ‘true value’ line for a greater number of
terms, but it does not seem likely that the values of the series will tend to a definite
limit. The equation for the modal expansion coefficients is a similar one and these
coefficients behave in the same way.

Fig. 4. [Graph: value of the series plotted against the number of terms, $\tau_0$ to $\tau_{30}$.]
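
The behaviour of the coefficients and of the partial sums can be checked numerically. What follows is only a minimal Python sketch under stated assumptions, not the computation used for the tables: it assumes the recurrence in the form $a_{s+1} = \sqrt{s/(s+1)}\,\{\sqrt{s}\,a_s + \sqrt{s-1}\,a_{s-2}\}/\sqrt{p}$ with $a_0 = 1$ and $a_1 = a_2 = 0$, takes $\tau_0$ as the normal integral from $-\infty$ to z and $\tau_s = He_{s-1}(z)\,\phi(z)/\sqrt{s!}$ for s ≥ 1 (He denoting the Hermite polynomials), and forms the sums $\tau_0 - \sum a_s \tau_s$. All function names are illustrative.

```python
import math


def mean_expansion_coefficients(p, n_terms):
    """Coefficients a_0..a_n of the expansion from the mean (assumed recurrence)."""
    a = [0.0] * (n_terms + 1)
    a[0] = 1.0  # a_1 and a_2 vanish for the expansion from the mean
    for s in range(2, n_terms):
        a[s + 1] = (math.sqrt(s / (s + 1)) / math.sqrt(p)) * (
            math.sqrt(s) * a[s] + math.sqrt(s - 1) * a[s - 2]
        )
    return a


def tetrachoric_functions(z, n_terms):
    """tau_0 = normal integral up to z; tau_s = He_{s-1}(z) phi(z) / sqrt(s!)."""
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    tau = [0.5 * math.erfc(-z / math.sqrt(2.0))]
    he_prev, he = 0.0, 1.0  # He_{-1} (placeholder) and He_0
    for s in range(1, n_terms + 1):
        tau.append(he * phi / math.sqrt(math.factorial(s)))
        # Hermite recurrence: He_s = z He_{s-1} - (s - 1) He_{s-2}
        he_prev, he = he, z * he - (s - 1) * he_prev
    return tau


def partial_sums(p, x, n_terms):
    """Value of the series up to each term, for the integral from 0 to x."""
    z = (x - p) / math.sqrt(p)
    a = mean_expansion_coefficients(p, n_terms)
    tau = tetrachoric_functions(z, n_terms)
    sums = [a[0] * tau[0]]
    for s in range(1, n_terms + 1):
        sums.append(sums[-1] - a[s] * tau[s])
    return sums


if __name__ == "__main__":
    for s, value in enumerate(partial_sums(49, 42.0, 30)):
        print(s, round(value, 7))
```

With p = 49 and x = 42 (so that z = −1) the sums at $\tau_4$ and $\tau_6$ come out at about .1561862 and .1582152, which is the kind of oscillation about the true value .1577387 described in the text.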

Turning our attention to the expansions from the mean, Fig. 1 (and Fig. 3 to
a less extent) would seem to suggest that the tetrachoric series gives quite a good
approximation to the value of the integral. Although some of the points are very
near to the ‘true value’ line, the approximation is not really a good one. The
important question for us is: To how many decimal places does the series give the
result correct? On going through the tables it will be found that there is no value
of the series up to the sth term giving the result correct to more than three or four
places. We now come to the real trouble. Suppose a frequency function is expanded
in tetrachoric series, how are we to know at what term to stop so as to obtain the
most accurate result? If the value of an integral is required, the true value is
wanted. In our work we chose integrals of which the value was already known.
From Figs. 1–4 it is easily seen that we have as good an approximation at the
5th or 6th term as at the 15th, say, and better than at the 30th. Of course, one
might calculate the various terms till the sums became more or less steady, take
the mean of these sums after the steady stage is reached and use that as the value
required. This process, however, will not give a greater accuracy than three or four
decimal places correct and very likely the result will not be so good as that. Besides
which it is difficult to give such an arbitrary weighting of terms a theoretical
justification. Thus it seems that the tetrachoric series is not at all suitable for the
representation of the Incomplete Γ-function.


TABLE III.

$\int_0^{29.4} \frac{x^{48} e^{-x}}{\Gamma(49)}\,dx$, $z' = -2.6846788$*.

 s    a_s′          Tetrachoric       Terms in Series    Value of Series
                    Functions τ_s     −a_s′ τ_s          up to term τ_s

 0    1.00000000    +.0036296         +.0036296           .0036296
 1    0.14433757    +.01085979        −.0015675           .0020621
 2    0.02946278    −.02061573        +.0006074           .0026695
 3    0.12521683    +.02752089        −.0034461          −.0007766
 4    0.06166251    −.02503988        +.0015440           .0007675
 5    0.02648957    +.01160193        −.0003073           .0004601
 6    0.04236301    +.00557065        −.0002360           .0002241
 7    0.03460284    −.01460370        +.0005053           .0007295
 8    0.02288715    +.00939504        −.0002150           .0005144
 9    0.02516285    +.00363988        −.0000916           .0004229
10    0.02488684    −.01101274        +.0002741           .0006969
11    0.02136289    +.00579095        −.0001237           .0005732
12    0.02167770    +.00509737        −.0001105           .0004627
13    0.02272772    −.00713419        +.0001621           .0006249
14    0.02256727    +.00058475        −.0000132           .0006117
15    0.02351439    +.00599464        −.0001410           .0004707
16    0.02546065    −.00455186        +.0001158           .0005865
17    0.02739094    −.00248832        +.0000682           .0006547
18    0.02996700    +.00573797        −.0001720           .0004827
19    0.03360181    −.00124666        +.0000419           .0005246
20    0.03803862    −.00454994        +.0001731           .0006977
21    0.04355963    +.00382135        −.0001665           .0005312
22    0.05068120    +.00204641        −.0001037           .0004275
23    0.05968962    −.00471305        +.0002813           .0007089
24    0.07107603    +.00066657        −.0000474           .0006615
25    0.08566837    +.00406751        −.0003485           .0003130
26    0.10443752    −.00276906        +.0002892           .0006022
27    0.12866399    −.00240728        +.0003097           .0009119
28    0.16018283    +.00383980        −.0006151           .0002969
29    0.20147298    +.00036667        −.0000739           .0002230
30    0.25589543    −.00382481        +.0009788           .0012017

True value .0005850.

* $z' = \dfrac{x - p'}{\sqrt{p'}} = \dfrac{29.4 - 48}{\sqrt{48}} = -2.6846788$.


TABLE IV.

$\int_0^{42} \frac{x^{48} e^{-x}}{\Gamma(49)}\,dx$, $z' = -.8660254$*.

 s    Tetrachoric       Terms in Series    Value of Series
      Functions τ_s     −a_s′ τ_s          up to term τ_s

 0    +.1932381         +.1932381          .1932381
 1    +.27418875        −.0395757          .1536624
 2    −.16790564        +.0049470          .1586093
 3    −.02798427        +.0035041          .1621134
 4    +.10905792        −.0067248          .1553886
 5    −.02346554        +.0006216          .1560102
 6    −.07134833        +.0030225          .1590328
 7    +.04145828        −.0014346          .1575982
 8    +.04451198        −.0010188          .1565794
 9    −.04705083        +.0011839          .1577634
10    −.02465039        +.0006135          .1583768
11    +.04681171        −.0010000          .1573768
12    +.00975248        −.0002114          .1571654
13    −.04356976        +.0009902          .1581556
14    +.00140962        −.0000318          .1581238
15    +.03877058        −.0009117          .1572121
16    −.00966795        +.0002462          .1574583
17    −.03323150        +.0009102          .1583686
18    +.01562623        −.0004683          .1579003
19    +.02744360        −.0009222          .1569781
20    −.01974338        +.0007510          .1577291
21    −.02171195        +.0009458          .1586749
22    +.02237974        −.0011342          .1575407
23    +.01622818        −.0009687          .1565720
24    −.02382475        +.0016934          .1582654
25    −.01111122        +.0009519          .1592173
26    +.02431475        −.0025394          .1566779
27    +.00643169        −.0008275          .1558504
28    −.02404492        +.0038516          .1597020
29    −.00222729        +.0004487          .1601507
30    +.02317774        −.0059311          .1542196

True value .1577387.

* $z' = \dfrac{x - p'}{\sqrt{p'}} = \dfrac{42 - 48}{\sqrt{48}} = -.8660254$.


TABLE V.

$\int_0^{.5} \frac{x^{14}(1-x)^{4}}{B(15,\,5)}\,dx$, $z = -2.6457513$, $p = 15$, $q = 5$, $m = 20$†.

 s    a_s            Tetrachoric       Terms in Series    Value of Series
                     Functions τ_s     −a_s τ_s           up to term τ_s

 0    1.00000000     +.0040751         +.0040751          .0040751
 3    −.19638608     +.02950904        +.0057952          .0098703
 4    +.01452267     −.02602453        +.0003780          .0102482
 5    +.03818545     +.01099737        −.0004199          .0098283
 6    +.05515045     +.00712711        −.0003931          .0094352
 7    −.01389639     −.01561177        −.0002170          .0092183
 8    −.03609105     +.02031787        +.0007333          .0099516

True value .0096054.

† $z = \dfrac{x - p/m}{\sigma} = \dfrac{.5 - .75}{.0944911} = -2.6457513$.


TABLE VI.

$\int_0^{.5} \frac{x^{3}(1-x)^{1/2}}{B(4,\,\frac{3}{2})}\,dx$, $z = -1.3010412$, $p = 4$, $q = \tfrac{3}{2}$, $m = 5\tfrac{1}{2}$*.

 s    a_s            Tetrachoric       Terms in Series    Value of Series
                     Functions τ_s     −a_s τ_s           up to term τ_s

 0    1.00000000     +.0966212         +.0966212          .0966212
 3    −.28327885     +.04839695        +.0137098          .1103310
 4    −.01400852     +.05941568        +.0008323          .1111633
 5    +.16688842     −.06703628        +.0111876          .1223509
 6    +.05349154     −.00778490        +.0004164          .1227673
 7    −.05325140     +.05554783        +.0029580          .1257253
 8    −.09445982     −.01930950        −.0018240          .1239013
 9    −.00063525     −.03745046        −.0000238          .1238775

True value .1188790.

* $z = \dfrac{x - p/m}{\sigma} = \dfrac{.5 - .7272727}{.17468526} = -1.3010412$.


TABLE VII.

$\int_0^{.1} \frac{x^{3}(1-x)^{1/2}}{B(4,\,\frac{3}{2})}\,dx$, $z = -3.59087385$†.

 s    a_s            Tetrachoric       Terms in Series    Value of Series
                     Functions τ_s     −a_s τ_s           up to term τ_s

 0    1.00000000     +.0001648         +.0001648          .0001648
 3    −.28327885     +.00307042        +.0008698          .0010346
 4    −.01400852     −.00458580        −.0000642          .0009704
 5    +.16688842     +.00530458        −.0008853          .0000851
 6    +.05349154     −.00442734        +.0002368          .0003219
 7    −.05325140     +.00191632        +.0001020          .0004239
 8    −.09445982     +.00111687        +.0001055          .0005294
 9    −.00063525     −.00291774        −.0000019          .0005275

True value .00023603.

† $z = \dfrac{x - p/m}{\sigma} = \dfrac{.1 - .7272727}{.17468526} = -3.59087385$.

When we consider the tables and graphs for the Incomplete B-function, the
results are certainly no better than in the case of the Incomplete Γ-function.
Unfortunately, owing to the lack of a difference formula connecting the successive
coefficients, we only calculated a few terms, but the behaviour of the graphs is
similar to that of the graphs of the Incomplete Γ-function. Fig. 5 is very like
Figs. 1–4 but Figs. 6 and 7 are rather different. In Fig. 5 the integral is
$\int_0^{.5} \frac{x^{14}(1-x)^{4}}{B(15,\,5)}\,dx$, where p is of high value and q is of moderate size. In Figs. 6 and 7
the integral is $\int \frac{x^{3}(1-x)^{1/2}}{B(4,\,\frac{3}{2})}\,dx$, where the upper limits are .5 and .1 respectively.
Here p is 4 and q is $\tfrac{3}{2}$. It seems in the incomplete Γ- and B-functions that the
points come nearer the ‘true value’ line for the tail of the integral than if the
upper limit is near the mode.

Fig. 5. [Graph: value of the series plotted against the number of terms, with the true value marked.]

Fig. 6. Fig. 7. [Graphs: value of the series plotted against the number of terms.]
Table VIII gives the results for $\int_0^{49} \frac{x^{48} e^{-x}}{\Gamma(49)}\,dx$ and, since $z = \dfrac{x - p}{\sqrt{p}} = \dfrac{49 - 49}{7} = 0$
for the expansion from the mean, all the tetrachoric functions of even order vanish.
It will be observed that the values of the series vary in a similar fashion to the
others and not one of these gives the result correct to more than four decimal places.


TABLE VIII.

$\int_0^{49} \frac{x^{48} e^{-x}}{\Gamma(49)}\,dx$, $z = 0$*.

(Expansion with regard to the Mean.)

 s    a_s            Tetrachoric       Terms in Series    Value of Series
                     Functions τ_s     −a_s τ_s           up to term τ_s

 0    1.00000000     +.5000000         +.5000000          .5000000
 3     .11664237     −.1628675         +.0189973          .5189973
 5     .00638743     +.1092549         −.0006979          .5182994
 7     .01785148     −.0842920         +.0015047          .5198041
 9     .01470566     +.0695373         −.0010226          .5187815
11     .00895618     −.0596711         +.0005345          .5193160
13     .01079260     +.0525526         −.0005672          .5187488
15     .01015854     −.0471442         +.0004789          .5192277

True value .51899938.

* $z = \dfrac{x - p}{\sqrt{p}} = \dfrac{49 - 49}{7} = 0$.


After a careful study of the tables and graphs we are forced to the conclusion
that a tetrachoric series is of no practical utility as a representation of skew
frequency curves such as $y = y_0 x^{p-1} e^{-x}$ and $y = y_0 x^{p-1}(1-x)^{q-1}$, and although it
may be rash to generalise from our results on these two types it would seem
that such a series cannot be generally suitable to represent skew frequency
distributions. Moreover, the types, which have been discussed, are of common
occurrence and for these the expansion is certainly futile.


The true values of the incomplete Γ-function were taken from Tables of the
Incomplete Γ-function which will be shortly issued by H.M. Stationery Office. The
values of the incomplete B-function were determined by direct calculation; the
power of (1 − x) was expanded and the result readily obtained with the help of the
relation

$$B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}.$$
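
The direct calculation described here can be sketched as follows. This is an illustrative Python reconstruction, not the authors' working: it assumes the integrand $x^{p-1}(1-x)^{q-1}$, expands the power of (1 − x) by the binomial series, integrates term by term from 0 to x, and divides by B(p, q) obtained from the Γ-function relation. The function name and the stopping rule are assumptions made for the example.

```python
import math


def incomplete_beta_ratio(x, p, q, max_terms=500, tol=1e-16):
    """Integral of t^(p-1) (1-t)^(q-1) from 0 to x, divided by B(p, q).

    (1 - t)^(q-1) is expanded as a binomial series and integrated term by
    term; the series terminates when q is a whole number and converges for
    0 <= x < 1 otherwise.
    """
    coeff = 1.0  # binomial coefficient C(q-1, k) times (-1)^k, starting at k = 0
    total = 0.0
    for k in range(max_terms):
        term = coeff * x ** (p + k) / (p + k)
        total += term
        if k > 0 and abs(term) < tol:
            break
        # step from C(q-1, k) (-1)^k to C(q-1, k+1) (-1)^(k+1)
        coeff *= -(q - 1 - k) / (k + 1)
    beta = math.gamma(p) * math.gamma(q) / math.gamma(p + q)
    return total / beta


if __name__ == "__main__":
    print(incomplete_beta_ratio(0.5, 15, 5))
```

For p = 15, q = 5 and upper limit .5 this gives about .0096054, in agreement with the true value quoted for Table V.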

In his Vorlesungen über die Grundzüge der mathematischen Statistik (Hamburg,
1920) Charlier, when dealing with skew frequency curves, gives as the general
equation for the skew frequency curves of his Type A

$$Y = \phi_0 + \beta_3\,\phi_0^{III} + \beta_4\,\phi_0^{IV} + \beta_5\,\phi_0^{V} + \ldots,$$

where $\phi_0 = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}$ and $\phi_0^{III}, \phi_0^{IV}, \phi_0^{V}, \ldots$ are the third, fourth, fifth, etc.
differential coefficients of $\phi_0$, i.e. Y is really expressed in a series of tetrachoric functions, or

$$Y = \tau_1 - \beta_3\sqrt{4!}\,\tau_4 + \beta_4\sqrt{5!}\,\tau_5 - \beta_5\sqrt{6!}\,\tau_6 + \ldots$$

$\beta_3$, $\beta_4$, $\beta_5$, etc. along with M (the mean) and σ Charlier calls the ‘characteristics’
of the distribution curve. Now he seems to think that generally the coefficients
$\beta_3$ and $\beta_4$* will only be required and so he has tabled $\phi_0(x)$, $\phi_0^{III}(x)$, $\phi_0^{IV}(x)$ for x = .00
to 3.00 at intervals of .01 and also for x = 4 (Tables III, IV and V on pp. 123–125)
to four decimal places. With the series up to $\beta_4$ the theoretical Y-coordinate will
be found, according to Charlier, but from our experience of tetrachoric functions
we are exceedingly sceptical about the accuracy of such a result. In fact, we feel
certain that the approximation will not be a good one. If the frequency curve
be little different from the normal then possibly the approximation would not be
very bad.


The above investigation was undertaken by me at the suggestion of Professor
Pearson and I am indebted to him for several hints. My grateful thanks are due
to Miss I. McLearn for her assistance in the preparation of the diagrams.


* Charlier defines the ‘skewness’ S to be $S = 3\beta_3$ and the ‘excess’ E to be $E = 3\beta_4$.


MISCELLANEA. 


I. On the $\chi^2$ test of Goodness of Fit.
By KARL PEARSON, F.R.S. 


In a paper published in the Philosophical Magazine for July 1900, pp. 157–175, I dealt with
the following problem: A very large population is sampled, say, the population $n_1, n_2, \ldots n_s, \ldots n_p$
with total N, and any individual sample is $m_1, m_2, \ldots m_s, \ldots m_p$, total M. The “probable
constitution” is given by:

$$m_1' = \frac{M}{N}\,n_1, \quad m_2' = \frac{M}{N}\,n_2, \quad \ldots \quad m_s' = \frac{M}{N}\,n_s, \quad \ldots \quad m_p' = \frac{M}{N}\,n_p.$$

If a large number of samples of size M are taken, what is the distribution of variations from
the “probable constitution” in these samples?

I showed that if the distribution of categories were such that no category contained a few
isolated units, then the distribution depended on the calculation of $\chi^2 = S_1^p\,\dfrac{(m_s - m_s')^2}{m_s'}$, and
provided a value for the probability P that samples would not diverge more than any given sample
from the “probable constitution.” This process is now familiar to statisticians as the $\chi^2$, P test.

The sole limiting conditions were that the samples should be random, and each should be of
the same size M.
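
As an illustration of the $\chi^2$, P process just described, the following Python sketch forms the “probable constitution” from a known population, computes $\chi^2$ for a sample, and evaluates P. It is a sketch under assumptions: the function names are hypothetical, the tail probability uses the standard closed forms for the $\chi^2$ distribution with a whole number of degrees of freedom, and the argument n′ is taken, as in the text, to be one more than the number of degrees of freedom.

```python
import math


def probable_constitution(population, sample_size):
    """m'_s = (M / N) n_s for each category of the sampled population."""
    total = sum(population)
    return [sample_size * n / total for n in population]


def chi_square(sample, expected):
    """chi^2 = S (m_s - m'_s)^2 / m'_s over all categories."""
    return sum((m - e) ** 2 / e for m, e in zip(sample, expected))


def chi_square_tail(chi2, n_prime):
    """P that chi^2 is exceeded, entered with n' = degrees of freedom + 1."""
    df = n_prime - 1
    half = chi2 / 2.0
    if df % 2 == 0:
        # even degrees of freedom: finite exponential series
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # odd degrees of freedom: normal tail plus a finite series
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= chi2 / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-half) * total


if __name__ == "__main__":
    expected = probable_constitution([50, 30, 20], 20)  # hypothetical population
    value = chi_square([12, 5, 3], expected)            # hypothetical sample
    print(value, chi_square_tail(value, n_prime=3))
```

Entering the same value of $\chi^2$ with a smaller n′ gives a smaller P, which is why the choice of n′ matters in what follows.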

In some cases the “probable constitution” (m′ series) can be found at once because the
distribution of the sampled population is known a priori. In other cases the values of the m′ series
have to be approximated to, and such approximations are the general rule in all discussions of
probable error.

We say for example that the standard deviation of the mean of a sample taken from an
indefinitely large population of size N and standard deviation σ is $\sigma/\sqrt{n}$, where n is the size of
the sample.

We say that the standard deviation of second moment-coefficients of samples of size n is

$$\frac{\sqrt{\mu_4 - \mu_2^2}}{\sqrt{n}},$$

where $\mu_2$ ($= \sigma^2$) and $\mu_4$ are the second and fourth moment-coefficients of the population sampled.
In fact every constant of the sample has a probable error determinable in terms of the constants
of the sampled population. All these distributions of deviations from “probable constitution”
are true for perfectly general but random samples of size n drawn from our indefinitely large
population.




But unfortunately in a considerable number of cases that sampled population is unknown to
us; we have no direct means of finding $\mu_2$, $\mu_4$, etc. What accordingly do we do? Why we replace
the constants of the sampled population by those calculated from the sample itself, as the best
information we have. And the justification of this proceeding is not far to seek. $\mu_2$ as found for
the sample will only differ from the $\mu_2$ of the sampled population by terms of the order $1/\sqrt{n}$;
for example if we are not dealing with small samples, and σ′ be the standard deviation of the
sample, σ′ differs from σ by terms of the order $\sigma/\sqrt{2n}$ and accordingly the standard deviation of
the mean is written $\sigma'/\sqrt{n}$ when it is really $\sigma/\sqrt{n}$. This method of treating probable errors is
universal in the case of fair sized samples to-day and scarcely needs justification. In writing the
sample values of the constants for those of the sampled population, we do not in any way alter
our original supposition that we are considering the distribution of random samples of size n.
We have still p − 1 degrees of freedom, if we have p categories of frequency.


The process of substituting sample constants for sampled population constants does not mean
that we select out of possible samples of size n, those which have precisely the same values of
the constants as the individual sample under discussion. Clearly the given sample has definite
moment-coefficients, and if there be p frequency categories the first p − 1 moment-coefficients
together with the size n of the sample would suffice to fix all the frequencies of the p categories*.
Hence no deviations from the “probable constitution” would be possible if we confined our
attention to samples of n tied to the constants of the given sample! In using the constants of
the given sample to replace the constants of the sampled population, we in no wise restrict the
original hypothesis of free random samples tied down only by their definite size. We certainly do
not by using sample constants reduce in any way the random sampling degrees of freedom.


What we actually do is to replace the accurate value of $\chi^2$, which is unknown to us, and
cannot be found, by an approximate value, and we do this with precisely the same justification as
the astronomer claims, when he calculates his probable error on his observations, and not on the
mean square error of an infinite population of errors which is unknown to him. The whole of this
matter was very fully discussed (pp. 164–7) in my original paper dealing with the $\chi^2$, P test.


The above re-description of what seem to me very elementary considerations would be 
unnecessary had not a recent writer in the Journal of the Royal Statistical Society† appeared to
have wholly ignored them. He considers that I have made serious blunders in not limiting my 
degrees of freedom by the number of moments I have taken; for example he asserts (p. 93) 
that if a frequency curve be fitted by the use of four moments then the n’ of the tables of 
goodness of fit should be reduced by 4. I hold that such a view is entirely erroneous, and that 
the writer has done no service to the science of statistics by giving it broad-cast circulation in 
the pages of the Journal of the Royal Statistical Society. 


What he would obtain if he placed this restriction on his samples is not the $\chi^2$ for the
distribution of samples of size n, but of samples which give definite moments. The absurdity of this
manner of approach is at once obvious, if as I have suggested, we consider the p first-moments, 
as there is no reason why we should not do,—for these are just as much “fixed” as the first four— 
and the conclusion must be that we can learn nothing at all about variation from our sample ; 
for we have p frequency groups and p-tying conditions. 


When we wish to find the probable error of a mean or a standard deviation, we do not start 
by fixing down these characters to their values in the individual sample; we suppose them 
to take all the possible values they could take by sampling, and after we have reached our 
measure of variation we then put into our formula the sampled values, to give an approximate 
value to the functions reached, because we are in ignorance of the real values in the sampled 
population. 

The writer in the Journal of the Royal Statistical Society speaks as if I applied $\chi^2$ to a
contingency table starting by fixing the marginal totals. As far as I am aware I am not guilty of
this. My conception of contingency is very different from my conception of $\chi^2$. I started my
conception of contingency with the idea not of a random sample, but with the idea that some
function of frequencies alone without regard to their relation to the measured characters would
lead to the value of the correlation. Naturally I started from the deviation of the individual cell
contents from the same cell contents on the basis of independent probability, as determined by
the marginal totals. There was no question of sampling in the matter. In now fairly usual
notation I termed

$$m_{ss'} - \frac{m_{s\cdot}\,m_{\cdot s'}}{M}$$

the cell contingency and after playing about with such cell contingencies for a time succeeded in
finding a function $\phi^2$ of them which for indefinitely fine grouping for a bi-variate normal frequency
distribution gave the correlation r as:

$$r^2 = \frac{\phi^2}{1 + \phi^2},$$

where

$$\phi^2 = \frac{1}{M}\,S\left\{\frac{\left(m_{ss'} - \dfrac{m_{s\cdot}\,m_{\cdot s'}}{M}\right)^2}{\dfrac{m_{s\cdot}\,m_{\cdot s'}}{M}}\right\} \quad \ldots\ldots\ldots (\alpha).$$

I see no reason for confusing this $\phi^2$ as a measure of correlation with the $\chi^2$ which is a measure
of variability in the samples of constant size drawn from an indefinitely large population. It was
different in its origin, as far as I am concerned, and different in its use. It is only when we come
to consider the probable error of $\phi^2$ that we have to distinguish between (a) the actual marginal
totals of the sample and (b) the probable constitution of the marginal totals as deduced from an
indefinitely large sampled population.


* This is Thiele’s method of representing frequency distributions.
† Vol. LXXXV. p. 87, 1922.


There are, as those who have read Biometrika* will recognise, considerable difficulties about
determining the probable error of $\phi^2$, where

$$1 + \phi^2 = S\left(\frac{m_{ss'}^2}{m_{s\cdot}\,m_{\cdot s'}}\right),$$

and the determination of the mean $\phi^2$ and of the standard deviation of $\phi^2$ involves very
troublesome analysis.

So laborious is the arithmetic involved that for ordinary statistical use it became doubtful
whether it would not be better to define $\phi^2$ as the mean squared contingency measured not from
the marginal totals of the sample, but from the “probable constitution” of the marginal totals
of the sample as deduced from the sampled population. In this case if

$$m'_{ss'} = \frac{M}{N}\,n_{ss'}, \quad m'_{s\cdot} = \frac{M}{N}\,n_{s\cdot}, \quad m'_{\cdot s'} = \frac{M}{N}\,n_{\cdot s'},$$

$$\phi^2 = \frac{1}{M}\,S\left\{\frac{\left(m_{ss'} - \dfrac{m'_{s\cdot}\,m'_{\cdot s'}}{M}\right)^2}{\dfrac{m'_{s\cdot}\,m'_{\cdot s'}}{M}}\right\} \quad \ldots\ldots\ldots (\beta),$$

or,

$$1 + \phi^2 = S\left(\frac{m_{ss'}^2}{m'_{s\cdot}\,m'_{\cdot s'}}\right);$$

with this change of definition the probable error and mean of $\phi^2$ are more easily obtainable, and
in this case for the first time, $M\phi^2$ can be looked upon as equivalent to a $\chi^2$.


The form (α) from my standpoint cannot be treated as a $\chi^2$, because it is not the
deviation-measure of a given sample from the sampled population. Nor again is (β) the deviation-measure
of the sample from the sampled population, unless we assume that population to have zero
contingency, i.e. $m'_{ss'} = m'_{s\cdot}\,m'_{\cdot s'}/M$.

But $\chi^2$ may in the form (β) be treated as a deviation-measure of the actual sample from an
artificial sampled population, which differs from the actual population in having no correlation
or contingency, but having the same marginal distributions of the two characters.

The moment, however, we assume form (β) for our contingency we are giving, what we clearly
must give, absolute freedom to the marginal totals of our samples. The sole limit on our sample
is its total size M. But when we come to actually calculating $\phi^2$ for the individual sample, or the
mean value or the standard deviation (i.e. probable error) of $\phi^2$ for a series of samples, we have
only one course open to us, if we do not know the constants of the sampled population, we must
insert the marginal totals of the individual sample of which we have cognizance in place of the
unknown values of the sampled population. Thus (α) and (β) provide ultimately the same $\phi^2$,
but the probable error of $\phi^2$ and the mean value of $\phi^2$ will be different in the two cases. In the
first case we vary our marginal totals with the sample as they obviously would vary in practice.
In the second case we define our $\phi^2$ to be a deviation from the independent probability of an
artificial population, we do not keep the marginal totals of the sample fixed any more than in (α).
But if we think in terms of $\chi^2$ (and not $\phi^2$) we appear to do so because ultimately we have to take
our marginal probabilities as those of the sample in default of a knowledge of any better values.


* Vol. V. p. 191, Vol. X. p. 570, Vol. XI. p. 570, and Vol. XII. p. 259.
This point seems to me well illustrated in what my critic in the Journal of the Royal
Statistical Society has to say on p. 90 of his paper about Messrs Greenwood and Yule’s use of $\chi^2$
for a fourfold table. He asserts that they ought to have entered the table of goodness of fit with
n′=2. The problem before them was whether their fourfold tables could possibly be samples of
bi-variate independent probability distributions. Each sample from such a distribution would
have perfectly free cell frequencies $m_{11}, m_{12}, m_{21}, m_{22}$, subject to the sole binding condition that

$$m_{11} + m_{12} + m_{21} + m_{22} = M.$$

The proper $\chi^2$ is given by

$$\chi^2 = \frac{\left(m_{11} - \dfrac{m'_{1\cdot}m'_{\cdot 1}}{M}\right)^2}{\dfrac{m'_{1\cdot}m'_{\cdot 1}}{M}} + \frac{\left(m_{12} - \dfrac{m'_{1\cdot}m'_{\cdot 2}}{M}\right)^2}{\dfrac{m'_{1\cdot}m'_{\cdot 2}}{M}} + \frac{\left(m_{21} - \dfrac{m'_{2\cdot}m'_{\cdot 1}}{M}\right)^2}{\dfrac{m'_{2\cdot}m'_{\cdot 1}}{M}} + \frac{\left(m_{22} - \dfrac{m'_{2\cdot}m'_{\cdot 2}}{M}\right)^2}{\dfrac{m'_{2\cdot}m'_{\cdot 2}}{M}} \quad \ldots (\gamma),$$

and this has three degrees of freedom and is what Messrs Yule and Greenwood desired to find,
and they properly used the value of P for n′=4.

Then like the astronomer, who finding the probable error of his mean to be $.67449\,\sigma/\sqrt{M}$ and
not knowing the σ of his sampled population, puts it equal to the σ of his observations, so
Messrs Yule and Greenwood very properly replaced the marginal totals of their unknown
population by those of their sample, but very properly did not replace n′=4 by n′=2!

But says my critic*, if they had, they would have got the same measure of improbability as if
they had compared the difference of percentages! Quite so, and obviously so; for in taking
percentages they have actually fixed their marginal totals taking 100 of each class and thus for
the first time confined their attention to a limited class of samples, not the random sample of
size M, which has not its marginal totals fixed. We have, indeed, reduced our degrees of freedom
by two in taking ratios.

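When we consider generally the $\chi^2$ for a fourfold table to measure the improbability of a
sample we are really comparing the special sample

     a      b    |  a+b                    a′       b′     |  a′+b′
     c      d    |  c+d        with        c′       d′     |  c′+d′
    a+c    b+d   |   M                    a′+c′    b′+d′   |   M

the general population, where in the latter case a′d′ = c′b′.

Now the mean square contingency of the first of these tables is

$$\phi^2 = \frac{1}{M}\left\{\frac{\left(a - \frac{(a+b)(a+c)}{M}\right)^2}{\frac{(a+b)(a+c)}{M}} + \frac{\left(b - \frac{(a+b)(b+d)}{M}\right)^2}{\frac{(a+b)(b+d)}{M}} + \frac{\left(c - \frac{(a+c)(c+d)}{M}\right)^2}{\frac{(a+c)(c+d)}{M}} + \frac{\left(d - \frac{(c+d)(b+d)}{M}\right)^2}{\frac{(c+d)(b+d)}{M}}\right\}$$

$$= \left\{\frac{a^2}{(a+b)(a+c)} + \frac{b^2}{(a+b)(b+d)} + \frac{c^2}{(a+c)(c+d)} + \frac{d^2}{(c+d)(b+d)} - 1\right\}$$

$$= \frac{(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)} \quad \ldots\ldots\ldots (\delta).$$

But the $\chi^2$ is

$$\chi^2 = \frac{\left(a - \frac{(a'+b')(a'+c')}{M}\right)^2}{\frac{(a'+b')(a'+c')}{M}} + \frac{\left(b - \frac{(a'+b')(b'+d')}{M}\right)^2}{\frac{(a'+b')(b'+d')}{M}} + \frac{\left(c - \frac{(a'+c')(c'+d')}{M}\right)^2}{\frac{(a'+c')(c'+d')}{M}} + \frac{\left(d - \frac{(c'+d')(b'+d')}{M}\right)^2}{\frac{(c'+d')(b'+d')}{M}}$$

$$= M\left\{\frac{a^2}{(a'+b')(a'+c')} + \frac{b^2}{(a'+b')(b'+d')} + \frac{c^2}{(a'+c')(c'+d')} + \frac{d^2}{(c'+d')(b'+d')} - 1\right\},$$

there being three degrees of freedom or we must take n′=4 in calculating the probability P, this
may be written

$$\chi^2 = \frac{1}{M}\left\{\frac{a^2}{p'_{1\cdot}\,p'_{\cdot 1}} + \frac{b^2}{p'_{1\cdot}\,p'_{\cdot 2}} + \frac{c^2}{p'_{2\cdot}\,p'_{\cdot 1}} + \frac{d^2}{p'_{2\cdot}\,p'_{\cdot 2}}\right\} - M,$$

where $p'_{\cdot 1}$, $p'_{\cdot 2}$, $p'_{1\cdot}$ and $p'_{2\cdot}$ are the four percentage numbers of the marginal categories in the
sampled population. Now we do not know these percentages in that population and we do what
every physicist, every astronomer, and (till I saw the paper by my critic in the Journal of the
Statistical Society I should have said) every statistician does, supply the unknown constants
from the sample, which leads us to

$$\chi^2 = \frac{M(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)},$$

as used in my memoir of 1912*.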


The problem I had and still have in view is the variability in samples of definite size—with 
no other restriction than sample size. The solution of that problem is absolutely comparable 
with that of any discussion of the probability of an observed result in the theory of probable 
errors. We have in the bulk of such cases constants involved which concern the distribution in 
an unknown population, and we supply those constants from the sample itself. 
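
The equivalence between the cell-by-cell form and the closed form for the fourfold table, once the marginal totals of the sample are supplied for those of the population, can be checked by a short calculation. The Python below is a minimal sketch with hypothetical function names and made-up cell frequencies; it is offered only to illustrate the algebra, not as part of the original argument.

```python
def fourfold_chi_square_cellwise(a, b, c, d):
    """Sum over the four cells of (observed - expected)^2 / expected,
    the expected value being (row total)(column total) / M from the sample."""
    M = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    cells = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / M
            total += (cells[i][j] - expected) ** 2 / expected
    return total


def fourfold_chi_square_closed(a, b, c, d):
    """Closed form M (ad - bc)^2 / {(a+b)(a+c)(b+d)(c+d)}."""
    M = a + b + c + d
    return M * (a * d - b * c) ** 2 / ((a + b) * (a + c) * (b + d) * (c + d))


if __name__ == "__main__":
    # hypothetical fourfold table
    print(fourfold_chi_square_cellwise(10, 20, 30, 40))
    print(fourfold_chi_square_closed(10, 20, 30, 40))
```

Both routes give the same number; the dispute in the text concerns only whether that number is to be referred to the tables with n′=4 or with n′=2.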


As I have already noted the probable error of a mean is

$$.67449\,\frac{\sqrt{\mu_2' - \mu_1'^2}}{\sqrt{M}}.$$

By this we understand that the means of samples restricted solely by their size M from an
indefinitely large population of moment-coefficients $\mu_1'$, $\mu_2'$ about a fixed origin will have a
variability determined by the above formula. But when we proceed to give both $\mu_1'$ and $\mu_2'$ the
values determined from the sample we know, we do not add in the manner of my Royal Statistical
Society critic, “but in doing so the type of samples is reduced to those having the mean and
standard deviation of the sample.” If we did, this selection of samples would clearly have no
variation of mean or standard deviation at all! In fact probable errors would be meaningless,
unless we drew our samples from a population already fully known to us, in which case we should
not in 99% of cases want to sample it at all.


In the same way when we use the marginal totals of the sample in formulae like (δ) we do not
thereby reduce our samples to those having constant marginal totals, we merely take the best
approximation available to the proper value of $\chi^2$, and the fact that $\chi^2$, as found from the sample,
is only an approximation to the true $\chi^2$ was fully recognised and discussed in my original memoir
in the Philosophical Magazine.


It only remains to say that the following sentence of my critic’s paper seems to me based 
upon a fallacious principle and apparently flows from a disregard of the nature of probable 
errors in general. 


“It should be pointed out that certain of Pearson’s Tables for Statisticians and Biometricians,
namely Tables XVII, XIX and XX, together with XXII (Abac to determine $r_P$) are all calculated
on the assumption that n′=4 in fourfold tables, and consequently should not be used when, as is
almost always the case, the marginal totals are obtained from the data” (loc. cit. p. 91).


* On a novel method of regarding the association of two variates classed solely in alternative
categories. Drapers’ Company Research Memoirs, Cambridge University Press.


I hold those tables are quite correctly calculated for n’=4, and those who attempt to modify 
them by assuming n’=2 will be dealing with an entirely different problem. Namely, they will 
be considering not the improbability of the given sample as one of all possible samples of the 
given size, which it really is, but one of the indefinitely smaller number of samples that have 
fixed marginal totals. We do not find the probable error of 7 for a tetrachoric table* on the 
assumption that the marginal totals are fixed. We find it on the assumption that the marginal 
totals also vary from sample to sample, and when we have found it, then we substitute in the 
result the values of not only the marginal totals, but the cell-contents, a, b, c, d of the sample 
itself for those of the unknown population. With $\chi^2$ we go through an exactly similar process of
reasoning. If by this procedure we in some mysterious manner tied our degrees of freedom down 
to the values of the cell-contents used in our formula and adopted from our sample there could 
be no probable error for $r$, for the values of a, b, c, and d are all required and used. I trust my
critic will pardon me for comparing him with Don Quixote tilting at the windmill; he must either
destroy himself, or the whole theory of probable errors, for they are invariably based on using
sample values for those of the sampled population unknown to us. For example here is an 
argument for Don Quixote of the simplest nature: In the $s$th category of a population $N$ the
frequency is $n_s$, a sample shows $m_s$ in a total $M$. The standard deviation of this frequency is

$$\sqrt{M\,\frac{n_s}{N}\left(1-\frac{n_s}{N}\right)}.$$

But we don’t know the population sampled and accordingly obtain an approximate value of the
above standard deviation by writing $m_s/M$ for $n_s/N$ and taking for the standard deviation of $m_s$

$$\sqrt{m_s\left(1-\frac{m_s}{M}\right)}.$$

In doing this it is not a question even of using a marginal total, we have used
a cell frequency found from our sample. We have therefore according to our critic reduced our
possibilities of freedom by selecting out of all possible samples those with $m_s$ in the $s$th cell; this
is exactly parallel to our reducing our freedom by “fixing” marginal proportions or moment-coefficients.
But if $m_s$ be fixed, it is ridiculous to talk of a variation of the $m_s$ frequency. Therefore
either $m_s=0$ or $m_s=M$, or the usual theory and practice of probable errors are wholly at
fault. I think this will illustrate what I mean by Don Quixote and the windmill.
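The substitution described above can be tried on an artificial case. What follows is a sketch under stated assumptions, not a computation from the note itself: the population figures are hypothetical, sampling is taken with replacement so that the simple binomial formula applies, and the function names are invented for illustration. It compares the standard deviation computed from the population values, the spread of the sth-cell frequency actually observed over repeated samples, and the approximate value obtained from a single sample.

```python
import math
import random

def sd_from_population(N, n_s, M):
    """sqrt(M * (n_s/N) * (1 - n_s/N)): requires the population frequency n_s."""
    p = n_s / N
    return math.sqrt(M * p * (1.0 - p))

def sd_from_sample(m_s, M):
    """sqrt(m_s * (1 - m_s/M)): the same formula with m_s/M written for n_s/N."""
    return math.sqrt(m_s * (1.0 - m_s / M))

def draw_m_s(rng, N, n_s, M):
    """Frequency in the s-th category for one sample of M drawn with replacement."""
    return sum(1 for _ in range(M) if rng.randrange(N) < n_s)

# Hypothetical population and sample sizes (assumptions for illustration).
N, n_s, M = 10_000, 2_000, 400
rng = random.Random(0)
counts = [draw_m_s(rng, N, n_s, M) for _ in range(500)]

mean = sum(counts) / len(counts)
observed_sd = math.sqrt(sum((c - mean) ** 2 for c in counts) / (len(counts) - 1))

print("from population values:", sd_from_population(N, n_s, M))
print("observed over repeated samples:", observed_sd)
print("approximation from the first sample alone:", sd_from_sample(counts[0], M))
```

The approximate value vanishes only when the sample frequency is 0 or M, while the frequency itself plainly varies from sample to sample.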


II. 


Is Tuberculosis to be regarded from the Aetiological Standpoint as an acute disease 
of Childhood? By Dr Kr. F. ANDVORD (Christiania). Tubercle, Vol. 11. No. 3,
December, 1921. 


This paper is, we must confess, unconvincing. The author holds that in a community that 
has long been subject to tuberculosis the time of infection should be fixed in the infantile years 
for the great majority of cases and consequently we should protect children for the first three or 
four years from infection. 


* Phil. Trans. Vol. 195A, p. 14.


As evidence of his views he takes a graph of what he calls a “population frame” which is
really the well-known “number living in a stationary population” ($l_x$) and represents within this
graph the numbers dying from tuberculosis and the numbers who have suffered from it at each
age. We are doubtful if his graphs for deaths are correctly drawn. They are made to rise
suddenly for about a year and then fall till age 7 but we suspect that they should fall from birth
till age 7. We cannot justify his chart (No. VIII) which gives the whole population and the
tubercular population. The non-tubercular found by this chart actually increase after age 17 for
many years so that the non-tubercular not only have no mortality but are increased by some
process of resurrection! Admittedly the chart is hypothetical but as it stands it calls for
amendment.
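The inconsistency objected to in the chart is easily tested for. The sketch below uses hypothetical figures, not values read from Chart VIII, and the function names are invented for illustration. It takes the number living and the number of tubercular at successive ages, forms the non-tubercular by difference, and reports the ages at which that remainder increases, which cannot happen in a stationary population where no one enters after birth.

```python
def non_tubercular(living, tubercular):
    """Non-tubercular at each age, found as whole population minus tubercular."""
    return [l - t for l, t in zip(living, tubercular)]

def ages_where_increasing(ages, series):
    """Ages at which the series exceeds its value at the preceding tabulated age."""
    return [ages[i] for i in range(1, len(series)) if series[i] > series[i - 1]]

# Hypothetical figures (assumptions for illustration only).
ages = [0, 7, 17, 27, 37]
living = [1000, 900, 870, 820, 760]
tubercular = [0, 300, 500, 440, 370]

remainder = non_tubercular(living, tubercular)
print("non-tubercular:", remainder)
print("increases at ages:", ages_where_increasing(ages, remainder))
```

Any age reported by the second function marks a point where the chart, as drawn, would require additions to the non-tubercular from nowhere.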

Dr Andvord’s remark that “one would hardly gather from these per-thousand curves,” i.e. 
from rates of mortality for various ages, “that, as is really the case, more persons die from 
tuberculosis in the first and second years of life than in any subsequent age period” seems to 
betray an inexperience in matters related to a life table: this weakness is shown elsewhere, e.g. 
p. 102, where deaths are stated without populations and without reference to age distributions. 


Dr Andvord may have other evidence in support of his views but the article under review 
does not justify them statistically; we think every point he brings out could be explained as
well on other hypotheses. He cannot, moreover, completely prove his case till he has studied 
communities which become subject to infection after having been kept free from it. For if his 
theory be correct, the measures he proposes would necessarily produce such a community. 


W. PALIN ELDERTON.